- Data Import Huddle – small groups finding solutions
Dev Group April Meeting Notes
The news and problem clinic ran as normal, with Jason pointing out www.delphimagazine.com, a listing of Delphi posts from around the web. His comment: “BeginEnd.net is probably still better” …
New releases of uniGUI and TMS WEB Core were mentioned, along with positive reviews of Visual Studio Code with OmniPascal as a useful lightweight code-reading tool.
The day was a “Data Importation Huddle” run by Adam.
Two fairly large Excel files of data had been provided. The files had multiple sheets and some human error in the data input. With about 20,000 rows of data in total, it was a non-trivial problem.
Members were given 2 hours to try to get the data into a state where they could answer some basic questions. About 5 members had “done their homework” and had a go at solving the problem prior to the event. The group split into 6 or 7 huddles, with a couple of people trying to go it alone, while others worked in groups of up to 4.
There was a good work-rate, only interrupted by lunch and the occasional cup of tea with some chocolate biscuits.
The day was marred somewhat by the organizer messing up his final presentation fairly horribly, but the solutions presented by 3 members from their individual huddles were genuinely interesting and informative. DavidC used Visual Studio with a downloaded Excel plug-in and LINQ-based object-data queries to answer the questions, MartinH used the FlexCel components in Delphi, and MarkJ used his own scripting language, which is an add-on to the Advantage Database.
Mark pushed the Excel files out to CSV prior to import, while the other groups all tried to import directly from Excel. DavidC and Martin both managed to do this, but other huddles had problems on the day because the sheets contained some spurious blank lines and mis-coded data. In the short time available, it was hard to write test code to check that the values in the imported rows were valid.
A key advantage of David’s method of importing the whole dataset into an object list in Visual Studio was that he could easily add validity testing to the “Add” method of the list, and thereby reject spurious rows of data.
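To make the blank-line and mis-coded-data problem concrete, here is a minimal sketch in Python of a CSV import that skips empty rows and sets aside rows it cannot parse. It is not any member's actual solution: the column names (`name`, `quantity`), the sample data and the `split_rows` function are all invented for illustration.

```python
import csv
import io

# Hypothetical sample: one row of empty cells, one blank line and one
# mis-coded quantity ("seven" instead of a number).
SAMPLE = """name,quantity
Widget,4
,
Gadget,seven

Sprocket,12
"""


def split_rows(csv_text):
    """Return (good, suspect): parsed records and rows needing a human look."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    good, suspect = [], []
    for cells in reader:
        # Skip blank lines and rows where every cell is empty.
        if not any(cell.strip() for cell in cells):
            continue
        line_no = reader.line_num
        # A row with the wrong number of columns cannot be trusted.
        if len(cells) != len(header):
            suspect.append((line_no, cells))
            continue
        record = dict(zip(header, cells))
        try:
            record["quantity"] = int(record["quantity"])
        except ValueError:
            # Mis-coded value: keep the line number so it can be fixed at source.
            suspect.append((line_no, cells))
            continue
        good.append(record)
    return good, suspect


good, suspect = split_rows(SAMPLE)
```

Keeping the rejected rows with their line numbers, rather than silently dropping them, gives a quick list of what to correct in the original spreadsheet.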
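The idea of validating inside the list's “Add” method can be sketched roughly as follows. David worked in Visual Studio with LINQ; this is a Python stand-in rather than his code, and the `Sale` record, the `ValidatedList` class and the validation rules are assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Sale:
    # Hypothetical record type; the real spreadsheet columns are not known here.
    customer: str
    amount: float


class ValidatedList:
    """A list wrapper whose add() rejects rows that fail basic checks."""

    def __init__(self):
        self.items = []     # rows that passed validation
        self.rejected = []  # (customer, amount, reason) for rows that failed

    def add(self, customer, amount):
        customer = customer.strip()
        try:
            value = float(amount)
        except (TypeError, ValueError):
            self.rejected.append((customer, amount, "amount is not a number"))
            return False
        if not customer:
            self.rejected.append((customer, amount, "missing customer"))
            return False
        if value < 0:
            self.rejected.append((customer, amount, "negative amount"))
            return False
        self.items.append(Sale(customer, value))
        return True


sales = ValidatedList()
sales.add("Acme", "19.50")   # accepted
sales.add("", "5")           # rejected: missing customer
sales.add("Bolt Ltd", "n/a") # rejected: amount is not a number
```

Because every row passes through one method, the import loop stays simple and all the validity rules live in a single place.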
The day felt like a good experiment with a new format for training, but since it was my idea I should let members give their feedback to confirm this! It showed how hard it is to judge the difficulty of a coding problem, as different groups stumbled at different points in the importation process for different reasons. However, since several people did complete the challenge, I think it was not too badly judged.
If we do something similar in the future, I would encourage some sharing between the different huddles during the day as people get on with the challenge, so that pitfalls one group has found can be avoided by others.
If I do something similar in future, I promise to prepare my final summing-up a bit better. As they say: “It worked fine earlier”. OUCH!