Dave Bowman describes how he optimized computing coordinates:
We are doing a project
for a state DOT where we needed to get the location of photos taken
along the roadway... They take these photos every year, so
in total we needed to locate roughly 24 million of them for
this project.
…
Through some complex
SQL machinations that I will skip here, we were able to get the route
and mile post (measure) for each photo into a set of tables in a SQL
Server database. The tables also had empty columns for x,y in Web
Mercator and lat,long, which we would populate by looking up these
values on the Linear Reference System (LRS) feature class, based on
the route and mile post. Simple!
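For concreteness, a sketch of what one such staging row might look like. The column names here are my assumption; the post never shows the actual schema:

```csharp
// Hypothetical shape of one staging row: route + milepost in,
// two coordinate pairs out (filled in by the LRS lookup).
class PhotoRow
{
    public long    PhotoId  { get; set; }
    public string  Route    { get; set; }
    public double  MilePost { get; set; }  // the LRS measure
    public double? X        { get; set; }  // Web Mercator, populated later
    public double? Y        { get; set; }
    public double? Lat      { get; set; }  // WGS84, populated later
    public double? Lon      { get; set; }
}
```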
…
A few years ago we
wrote an ArcGIS Server REST Server Object Extension (SOE) using .NET
and ArcObjects, which would return the x,y (in web mercator) from an
LRS feature class, based on a specified route and mile post.
…
However, some initial
testing showed that each of these lookups via the REST SOE was
taking ~150ms. That is lightning fast when doing a single lookup (you
can check out this page, which uses another version of the same
function), but in a tight loop, it was looking like 41 days of
processing time.
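That estimate checks out: 24,000,000 lookups × 0.15 s ≈ 3,600,000 s, which is roughly 41.7 days of nonstop processing.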
…
The first thing I did
was decide to ditch the REST SOE. There is no question that there is
a ton of overhead in running this sort of process through an SOE.
…
Thus, I fired up Visual
Studio and created an ArcGIS console application using the templates
provided with the ArcGIS Desktop SDK. From there, I dumped in the
ArcObjects code from the SOE and started scaffolding up code around
it. For those interested, the core code for this is in a gist at the
end of the post…
…
My initial tests on a
batch of 10,000 photos ran in a few minutes, with an average lookup
time of 5ms! Problem solved! Woot!
…
Then I unleashed it on
a test batch of 100,000 records. Blammo! About 12,000 records in, an
OutOfMemoryException is thrown. Darn. And I had not even gotten to
writing out the output text file!
…
The solution to this
was to run the processing in batches. At the beginning of a batch,
I’d open the feature class, run through the batch, and then close
it.
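A minimal sketch of that open/process/close batch pattern in plain C#. All names here are illustrative, not the author's gist; `LrsHandle` stands in for the ArcObjects feature class:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class BatchRunner
{
    const int BatchSize = 10_000;

    static void Main()
    {
        // Stand-in workload for the photo records.
        IEnumerable<int> photoIds = Enumerable.Range(1, 100_000);

        foreach (var batch in Chunk(photoIds, BatchSize))
        {
            // Open the heavy resource once per batch and release it after,
            // so per-feature allocations can be reclaimed between batches
            // instead of piling up into an OutOfMemoryException.
            using (var lrs = new LrsHandle())
            {
                foreach (var id in batch)
                    lrs.Lookup(id);
            }
        }
    }

    // Split a sequence into fixed-size batches.
    static IEnumerable<List<T>> Chunk<T>(IEnumerable<T> source, int size)
    {
        var buffer = new List<T>(size);
        foreach (var item in source)
        {
            buffer.Add(item);
            if (buffer.Count == size)
            {
                yield return buffer;
                buffer = new List<T>(size);
            }
        }
        if (buffer.Count > 0)
            yield return buffer;
    }
}

// Stand-in for opening/closing the LRS feature class (ArcObjects COM
// objects in the real code).
sealed class LrsHandle : IDisposable
{
    public void Lookup(int photoId) { /* route & milepost lookup here */ }
    public void Dispose() { /* release the feature class here */ }
}
```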
…
By this time I’d
re-run the same 10,000 and 100,000 record test datasets at least 50
times, and this seemed to be a waste of time. Why not use my test
runs to process real data? In my refactoring, I also wanted to ensure
that the process could fail at any time, and re-starting things would
pick up where it left off – zero wasted cycles.
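The post doesn't show the restart mechanism, but with this schema one common way to get "pick up where it left off" almost for free is to feed each batch only from rows whose coordinate columns are still NULL. A sketch, with hypothetical table and column names:

```csharp
using System;
using System.Data.SqlClient; // Microsoft.Data.SqlClient on newer stacks

class ResumableFeed
{
    // Only rows not yet processed; a crashed run restarts on exactly
    // the unfinished work. Table/column names are assumptions.
    const string Sql = @"SELECT TOP (@n) PhotoId, Route, MilePost
                         FROM dbo.Photos
                         WHERE X IS NULL";

    static void Main()
    {
        var connStr = "Server=.;Database=Photos;Integrated Security=true";
        using (var conn = new SqlConnection(connStr))
        using (var cmd = new SqlCommand(Sql, conn))
        {
            cmd.Parameters.AddWithValue("@n", 10_000);
            conn.Open();
            using (var reader = cmd.ExecuteReader())
                while (reader.Read())
                    Console.WriteLine(
                        $"{reader["PhotoId"]}: route {reader["Route"]} @ {reader["MilePost"]}");
        }
    }
}
```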
…
So I set two of these
processes running on Friday evening and went to bed pretty pleased
with myself. Saturday morning, I found that, with less than 1 million
records processed, both processes had crashed due to SQL timeouts. At
least I was glad I'd built in that fault tolerance, so that the
~900,000 records which completed and were stored would not have to be
re-run.
I extended the
timeouts in Massive and started them back up, but noticed that now
the average time for a route & mile post lookup was ~30ms. What?
…
I suddenly had an idea
of what the issue was. The variation in the lookup time was likely
because some route features are longer than others, and longer
routes have more vertices, which simply take more time to read
from disk. Not only that: in the code, I was searching for the route
feature on every single iteration, so this tiny variation in
performance was magnified.
What if, when I did
that SELECT TOP query, I made sure that every record I got back was
on the same route? That way, I would only have to search for the
route feature once per batch, and the tight loop over the batch would
be working with the same geometry, simply looking up the point along
that feature at the specified measure value.
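One plausible phrasing of that same-route SELECT TOP query, reusing the hypothetical schema from above: pick one route that still has unprocessed rows, then pull the whole batch from that route only.

```csharp
class SameRouteBatch
{
    // Every row in the batch shares one route, so the route feature is
    // fetched once per batch instead of once per record.
    public const string Sql = @"
        SELECT TOP (@n) PhotoId, MilePost
        FROM dbo.Photos
        WHERE X IS NULL
          AND Route = (SELECT TOP (1) Route
                       FROM dbo.Photos
                       WHERE X IS NULL)
        ORDER BY MilePost;";
    // An index such as (Route, MilePost) filtered on X IS NULL would keep
    // both the route pick and the batch scan cheap; this is presumably
    // the kind of index the refactoring below added.
}
```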
This took a little
refactoring and some SQL indexes, but it had an amazing effect. As
I'm typing this, the process is running rock solid at ~3ms per
lookup, which is roughly 1.2 million lookups per hour.
…
Sounds familiar, doesn't it?
Especially that frustration when you kick off a process
that should run for a day or two and walk away, and when
you come back, it turns out the process
died ten minutes after you left.
Debugging, rigor, and fault tolerance are everything.
Original post: http://vasnake.blogspot.com/2013/12/41-20.html