söndag, maj 27, 2012

Analyzing logfiles with Power Shell

The other day I was tasked with sumarizing statistics about a web application running on IIS version 6.5. This particular application was to be migrated to a different platform and thus some questions about which users and how frequent this application was being used came up. There were about 100 logfiles and a couple of hundred megabytes in total and they all had the same columns and so I started of by concatenating all the files and at the same time dropped any extra headings. Continued from there by trying to aggregate by date, user and nummer of access'. This did not turn out well however, the logic was alright but the Power Shell execution environment was not good enough at this kind of task. Power Shell took all of my RAM, about 3GB, but returned no results. I can see why as it has to process all of the data to be able to yield any trustworthy results (compare this task to distinct, group by or order by in TSQL) but I think the performance shown is really poor, isn't Power Shell supposed to be able to handle this amount of data? What in the end did work was; aggregate on the same level but per file, then concatenate and aggregate once more. I should have just imported the lot into SQL Server.

fredag, maj 04, 2012

SQL Server, troubleshooting locks

Today was the end of a fairly exhausting work week of which most consisted of what seemed to be a deadlocked query but turned out to be something entirely different altogether. The symptom was that the SSIS package executed for what seemed like an eternity. After some investigation it turned out that the package itself consumed no CPU and no IO -it was waiting. The SPID in SQL Server consumed no IO but some CPU. There were some locks related to the transaction initiated by this SPID which by me seemed to be the problem but we could not verify this as we lacked access on the server. We made an effort in disabling the transaction in the SSIS package which rendered the exact same symptom and this was the point where we noticed that the transaction was not a part of the problem at all. After spending some more time tracing and analysing what was actually going on on the server the root cause was found; a badly written query consisting in part of a WHERE NOT EXIST (SELECT ... clause. It turned out that the amount of data in the stage area in combination with a growing amount of data in the detail area together made the query useless. It took 16h+ to execute, after which we had to ask our DBA to kill it. The solution was simple enough; rewrite the query to use LEFT OUTER JOIN ... a ... WHERE a.col IS NULL. By using this technique instead the query finished in less than 2s.

Lesson learned; if it doesn't look like a deadlock then don't treat it as such.