I wouldn't mind putting some criteria together, but I need a few days to mull it over. Caryatis is spot on when he says to get some consensus on the best way to parse. Consensus now would make accepting potential changes a lot easier. I wouldn't mind taking a look at live first to get some idea of baselines.
10,000 lines of combat sounds like a lot, but it is a reasonable figure to get a solid dataset. Getting 10,000 lines takes significantly less time than leveling up a char.
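To put a rough number on why 10,000 is a reasonable figure, here is a minimal sketch using the standard error of a proportion. It assumes each hit is independent and that we care about how precisely we can measure the share of hits landing on one damage value. The function name and the 5% example share are mine, purely for illustration.

```python
import math

def share_standard_error(share, n_hits):
    """Standard error of an observed share (0..1) measured over n_hits hits.

    Assumes hits are independent draws, which is an approximation.
    """
    return math.sqrt(share * (1.0 - share) / n_hits)

# Compare how tightly a 5% share is pinned down at different parse sizes.
for n in (1_000, 10_000):
    margin = 1.96 * share_standard_error(0.05, n)  # approx. 95% margin
    print(f"{n} hits: 5% share measured to within +/- {margin:.2%}")
```

Note that 10,000 log lines will include misses and other non-hit lines, so the number of actual hits will be lower than the line count.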
Any parses on live should be matched with identical parses on 0.8.0.
As far as what AC brackets to use, there should be a bell curve distribution of hits across 20 different damage values. As AC and PC level increase, the bulk of hits will shift to the lower end of that 20, and as mob level increases the hits will shift to the upper end. To replicate the live formulas closely, you need to identify the lower and upper end of that AC bracket at each level interval (5-level intervals seem reasonable), and parse 10-20 discrete AC values in between (that is, 10 to 20 separate AC values in total, not steps of Starting AC + 20, Starting AC + 40, etc.). It's also necessary to understand the impact of PC level vs NPC level as a ratio rather than simply as 5 levels +/-.
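Something like the following could serve as a starting point for the parse plan. It is only a sketch: the function names are made up, it assumes the 20 damage values are evenly spaced and that both the minimum and maximum hit show up in the parse, and the AC numbers in the usage lines are placeholders rather than real bracket ends.

```python
def interval_shares(hits, n_intervals=20):
    """Share of hits on each of n_intervals evenly spaced damage values.

    Assumption: the lowest and highest observed hits are the two ends
    of the damage range, with the other values evenly spaced between.
    """
    lo, hi = min(hits), max(hits)
    if lo == hi:
        raise ValueError("need at least two distinct damage values")
    step = (hi - lo) / (n_intervals - 1)
    counts = [0] * n_intervals
    for h in hits:
        counts[round((h - lo) / step)] += 1
    return [c / len(hits) for c in counts]

def ac_sample_points(ac_low, ac_high, n_points=15):
    """Discrete AC values spread evenly between the bracket ends."""
    if n_points < 2:
        raise ValueError("need at least two sample points")
    step = (ac_high - ac_low) / (n_points - 1)
    return sorted({round(ac_low + i * step) for i in range(n_points)})

def level_ratio(pc_level, npc_level):
    """PC level relative to NPC level, instead of a flat level gap."""
    return pc_level / npc_level

# Placeholder bracket ends; the real ones would come from live parses.
print(ac_sample_points(100, 240, 15))
# The same 5-level gap is a very different ratio at low vs high level.
print(level_ratio(10, 15), level_ratio(50, 55))
```

Running interval_shares on a live parse and on the matching 0.8.0 parse gives two lists of 20 shares that can be compared value by value.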
FWIW this is a pretty big task. Once some criteria are established, I imagine it would be more sensible to split the work between several people if there are other volunteers.
Plus, if people already have chars at various levels on live, it saves a lot of leveling time.