Yesterday, I introduced a data collection problem and explained how to use Python to retrieve and parse the data in question from the web. To recap, I wanted to download a number of statistics on all of the New York Giants’ 2009 draft picks, and then save that data to a spreadsheet. I created a CSV file with their names, some additional information, and blank columns for height, weight, D.O.B. and college.
Using Python’s built-in web interface, along with the html5lib HTML parser, I described how to retrieve the players’ profile URLs, download the data, and parse it into a Python dict indexed by the player’s name. Our final task is to save this data back to the spreadsheet for storage. Fortunately, most of the hard work for this task was done in part 1.
With the data stored as a Python dict, we need only the built-in csv module to write out the new data.
This final step is accomplished with two simple Python functions. The script takes three pieces of data: the dict of parsed web data, the file path to the old spreadsheet, and the file path to the new completed spreadsheet.
It may seem redundant to create a new spreadsheet with all of the data, rather than overwrite the original file with the new entries. This, however, is done as a precaution against error, and should be used as a general rule of thumb. When coding, it is always better to have too many copies of something rather than not enough! First, we extract the column headers from the old file, which is done in lines 7-11.
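The header step can be sketched roughly as follows. This is a minimal Python 3 illustration and not the original script: the function name `read_headers` is hypothetical, and it assumes the column headers sit in the first row of the old file.

```python
import csv


def read_headers(old_path):
    """Return the first row of the CSV file at old_path as a list of strings.

    Assumption: the first row of the old spreadsheet holds the column headers.
    """
    # newline="" is what the csv module documentation recommends for file objects
    with open(old_path, newline="") as old_file:
        reader = csv.reader(old_file)
        return next(reader)  # raises StopIteration if the file is empty
```

The old file is only ever opened for reading here, which is in keeping with the precaution of never overwriting the original.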
Then, we fill in the cells by matching the old data with the new in lines 16-24. Writing the rows to the new spreadsheet is very straightforward.
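A hedged sketch of this matching step in Python 3 is shown below. Several details are assumptions made for illustration, not taken from the original script: the function name `write_completed_sheet`, the player name being in the first column, each dict value being itself a dict with the keys `height`, `weight`, `dob` and `college`, and the number of cells kept from the old file being passed in as `keep`.

```python
import csv


def write_completed_sheet(parsed, old_path, new_path, keep=3, name_col=0,
                          fields=("height", "weight", "dob", "college")):
    """Copy the old spreadsheet to new_path, filling in cells from parsed.

    Assumptions (hypothetical layout): parsed maps a player's name to a dict
    keyed by the names in fields, and the name is in column name_col.
    """
    with open(old_path, newline="") as old_file, \
         open(new_path, "w", newline="") as new_file:
        reader = csv.reader(old_file)
        writer = csv.writer(new_file)
        writer.writerow(next(reader))  # copy the header row unchanged
        for row in reader:
            if not row:
                continue  # skip blank lines
            stats = parsed.get(row[name_col], {})
            # keep the leading cells from the old file, take the rest from the web data
            writer.writerow(row[:keep] + [stats.get(f, "") for f in fields])
```

A player missing from the dict simply gets empty cells, so one failed download does not stop the rest of the file from being written.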
The first three cells of each row are from the old file, and the rest come from the parsed web data. The height data, however, is in a format that causes problems for the CSV format; specifically, the feet-inches formatting will be interpreted as a date, such that 6-1 will be converted to 6/1/2009 by a spreadsheet program, such as Excel.
This is the one problem with the parsed web data that should be addressed before the data can be saved correctly.
To get around this, we create a small helper function that converts the height format to feet’inches, which will not trigger this error.
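Such a helper might look like the sketch below. The name `format_height` is hypothetical, and the code assumes the parsed height is a string with a single hyphen between feet and inches, such as 6-1.

```python
def format_height(height):
    """Convert a feet-inches string such as "6-1" to the form 6'1.

    Values without a hyphen are returned unchanged.
    """
    feet, sep, inches = height.partition("-")
    if not sep:
        return height  # nothing to convert
    return feet + "'" + inches


print(format_height("6-1"))  # prints 6'1
```

Because the apostrophe is not a date separator, the converted value avoids the date conversion described above.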
After running the script and getting confirmation that the file has been written successfully, our new and completed spreadsheet looks like this:
Success! We have completed the task of saving the parsed data to a spreadsheet and are now done. The code from this tutorial has been updated in the ZIA Code Repository, and is available for you to download. I hope this two part series has helped you better understand some of the basic Python techniques and best practices for collecting data from the web.
As always, I welcome any questions or comments you have on either part of this tutorial.