It wasn't me. You can't prove anything.


2010-09-09

Python crash

I ^gasp^ had to reboot my Linux box this morning after running a Python script.

Here is the scenario.

I wrote a script in Python that parses a web server log file. We are trying to connect the dots on how people come into the web page versus where they end up. This is needed by every single company trying to use the internet for marketing and sales. That basically means everyone. This is a normal thing for companies to look for.

The log files add up to 7.8 million lines. Each line is about four feet long. The combined size of the logs is something like 3.6 gigabytes. My simple script walks the logs and gathers some basic information about where people entered the page and where they ended up. It then zippers the two and boils it all down to about [redacted] "hits". These are specific tacks that tell us if we are getting our message out there to people who find our page.

I ran the script. My machine started running slower and slower. I could tell the machine was chugging on memory. The CPU was hardly being used. The comparisons are very simplistic in the script. The script kept running. The machine all but ground to a halt. The script finished. The computer did not come back. I was able to copy the table I needed off the screen and email it to the pertinent people. It took ten times longer, but things worked. Still, I needed to reboot the computer to get my performance back.

The script had walled off a chunk of memory and for some reason, Python was not giving it up. The drive was chugging and the screen was drawing point by point. This kind of thing happens all the time on Windows. I'm not used to seeing it on Linux. I was able to recover and reboot without a real crash. I didn't lose any data. It was "my" script that brought it to its knees.

I did not design the script at first to be memory friendly. It was not difficult to change the script to act a bit better on this front. After rebooting and running again, the machine did not even feel it.
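
A minimal sketch of what "memory friendly" can look like for this kind of job, not the actual script. It assumes the logs are in the common Apache-style combined format, that a visitor can be identified by IP address, and that "entry" and "exit" simply mean the first and last path that visitor requested. The names LOG_LINE, entry_exit_counts and entry_exit_counts_from_file are made up for illustration. The point is to read one line at a time and keep only two short strings per visitor, instead of holding all 3.6 gigabytes in memory.

```python
import re
from collections import Counter

# Assumed log layout (Apache-style combined format); the real logs may differ.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] "(?:GET|POST|HEAD) (?P<path>\S+)'
)


def entry_exit_counts(lines):
    """Count (entry page, exit page) pairs, one pair per visitor IP."""
    first_last = {}  # visitor IP -> [first path seen, last path seen]
    for line in lines:  # a file object hands over one line at a time
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that do not look like requests
        ip, path = match.group("ip"), match.group("path")
        pages = first_last.get(ip)
        if pages is None:
            first_last[ip] = [path, path]
        else:
            pages[1] = path  # overwrite the exit page, keep the entry page
    return Counter((first, last) for first, last in first_last.values())


def entry_exit_counts_from_file(log_path):
    """Stream a log file from disk without loading it whole."""
    with open(log_path, errors="replace") as handle:
        return entry_exit_counts(handle)
```

Memory use here grows with the number of distinct visitors, not with the number of log lines, which is usually the difference between a machine that swaps itself to death and one that does not even feel it.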
