Actually, this is a better list to look at. Raw gprof output is hard to sort through. Here's the flat profile:
Code:
Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds     calls  ms/call  ms/call  name
 49.33      1.11     1.11     25523     0.04     0.08  EntityList::AICheckCloseArrgo(Mob*, float, float)
 22.67      1.62     0.51  18274884     0.00     0.00  Mob::Dist(Mob const&)
  3.56      1.70     0.08  81363398     0.00     0.00  LinkedListIterator<Entity*>::GetData()
  3.33      1.77     0.07  25042202     0.00     0.00  LinkedListIterator<Entity*>::Advance()
  2.89      1.84     0.07  25075944     0.00     0.00  LinkedListIterator<Entity*>::MoreElements()
  2.67      1.90     0.06  20299491     0.00     0.00  Entity::IsClient()
  1.78      1.94     0.04  18300531     0.00     0.00  Entity::IsCorpse()
  1.56      1.98     0.04  19729699     0.00     0.00  Mob::IsMob()
  1.11      2.00     0.03  19098228     0.00     0.00  Entity::CastToMob()
  1.11      2.02     0.03                              sqrtf
These are total execution time metrics for each individual function. Normally the flat view isn't very useful, but in our case it is: all of the expensive operations come from the same method, AICheckCloseArrgo.
Now, what I posted before was a different view (the call graph). Its column names are:

index  % time  self  children  called  name

The columns in the flat profile above are described as follows.
Quote:
 %          the percentage of the total running time of the
 time       program used by this function.

 cumulative a running sum of the number of seconds accounted
 seconds    for by this function and those listed above it.

 self       the number of seconds accounted for by this
 seconds    function alone.  This is the major sort for this
            listing.

 calls      the number of times this function was invoked, if
            this function is profiled, else blank.

 self       the average number of milliseconds spent in this
 ms/call    function per call, if this function is profiled,
            else blank.

 total      the average number of milliseconds spent in this
 ms/call    function and its descendents per call, if this
            function is profiled, else blank.

 name       the name of the function.  This is the minor sort
            for this listing.
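As a worked example of reading a row, take the Mob::Dist line from the flat profile above: the per-call cost is just self seconds divided by calls.

Code:
0.51 s / 18,274,884 calls ≈ 0.000028 ms per call (roughly 28 ns)

That's why it shows 0.00 in the ms/call columns; the cost comes from sheer call volume, not from any single call being expensive.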
What I pasted in the last message was basically a performance call stack for AICheck. So, the 1.11 indicates the number of seconds, not a percentage, and .51 (Dist), .07 (iterator), .06 (iterator) and so on should all together add up to the 1.11; you're not adding percentages, you should be adding seconds. Suffice to say, the iterator operations add up.
AICheckClose took 1.11 seconds, and its call tree breaks down like this: Dist (.51) is roughly half; the linked list operations at .07, .06, and .05 come to about .2, which is still heavy; and then the calls to IsClient, IsCorpse, and IsMob make up the rest of AICheck's total time.
Each time we run through the iterator, we incur the costs of Iterator.Advance, GetData, etc., and on top of that the Is* checks (IsClient/IsCorpse/IsMob).
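I haven't pasted the actual AICheckCloseArrgo source here, but going by the call counts in the profile, the inner loop presumably looks something like the sketch below. Only MoreElements/Advance/GetData, the Is* checks, CastToMob and Dist come from the profile; the function signature, Reset(), and the variable names are my guesses.

Code:
// Rough sketch (not the real zone source) of the per-entity work the profile implies.
void AICheckCloseArrgo_sketch(Mob* sender, float iAggroRange, LinkedList<Entity*>& entity_list)
{
    LinkedListIterator<Entity*> iterator(entity_list);  // constructor shape assumed
    iterator.Reset();                                    // assumed, not in the profile
    while (iterator.MoreElements()) {                    // ~25M calls in the profile
        Entity* ent = iterator.GetData();                // ~81M calls, so the real code likely
                                                         // calls this more than once per pass
        if (ent->IsClient() || ent->IsCorpse() || !ent->IsMob()) {  // ~18-20M calls each
            iterator.Advance();                          // ~25M calls
            continue;
        }
        Mob* mob = ent->CastToMob();                     // ~19M calls
        if (sender->Dist(*mob) <= iAggroRange) {         // ~18M calls, about half the self time
            // ...aggro/assist handling...
        }
        iterator.Advance();
    }
}

The point is just that the iterator overhead and the type checks are paid once per entity per AI tick, before the distance math even gets a chance to matter.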
AICheck makes up 50% of the CPU cycles in zone, and this is just breaking that down into what's heaviest inside it.
You can look at the entire profile here:
http://denial.dyndns.org/hma/eqemu/zone.profout
Obviously, the dist check is the big hitter, but right behind it comes the time spent in iterator operations.
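For what it's worth, the sqrtf entry in the flat profile suggests Dist() takes a square root for every pair it looks at. If the result is only being compared against a range, the comparison can be done on squared values instead. This is just a sketch; DistSquared is a hypothetical helper, not an existing method, and I'm assuming Dist() is the usual sqrtf of the summed squares.

Code:
// Sketch only: compare squared distance against squared range, so no sqrtf.
inline float DistSquared(float x1, float y1, float z1,
                         float x2, float y2, float z2)
{
    const float dx = x1 - x2, dy = y1 - y2, dz = z1 - z2;
    return dx * dx + dy * dy + dz * dz;
}

// Range check without the square root:
//   if (DistSquared(x1,y1,z1, x2,y2,z2) <= iAggroRange * iAggroRange) { ... }
// instead of taking a sqrtf and comparing against iAggroRange directly.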
One thing is that we're using a doubly linked list even though, from what I can tell, we only ever traverse forward. We could probably reduce this to a singly linked list to speed things up, unless we can identify cases where we want to go forward and then backward.
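To illustrate the difference (generic sketch, not the existing LinkedList/LinkedListIterator classes): a forward-only list needs just one next pointer per node, so nodes are smaller and insert/remove touch half the pointers.

Code:
#include <cstddef>

// Generic sketch of a forward-only (singly linked) node; a doubly linked
// node would also carry a "prev" pointer that forward traversal never uses.
template <typename T>
struct SinglyLinkedNode {
    T data;
    SinglyLinkedNode* next;
};

// Forward traversal is all the AI loop appears to need.
template <typename T, typename Func>
void ForEach(SinglyLinkedNode<T>* head, Func visit)
{
    for (SinglyLinkedNode<T>* node = head; node != NULL; node = node->next)
        visit(node->data);
}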