A brief history of how I started to create Perl modules for data quality.
A run-through of the design and functionality of modules for parsing a person's name. A description of creating a formal grammar with Parse-RecDescent. Using regression testing to validate changes.
How hard was it to develop CPAN modules:
Amount of time needed to create modules and keep them up to date. Dealing with complexity.
Building on the work of others:
How to find the best CPAN modules, which can save you from reinventing the wheel. Researching current designs and algorithms.
Making sure it works:
Building up a base of users who can assist with testing and supplying sample data. Handling requests for bug fixes and enhancements. Keeping control of the scope.
How does it stack up against commercial software:
Comparing the features and accuracy of the Perl modules with those of their commercial equivalents.
What's next:
Other data quality modules that still need to be developed, GUIs, integration with other tools, data warehousing, ETL.
Keywords: Perl, CPAN, Data Quality, Parsing, Software Design
Article: Electronic (PDF File; 503.063KB). Published by The Open Source Developers' Conference Papers.