The Salty Economist

Things I Should Have Learned in High School
posts - 56, comments - 0, trackbacks - 0

Threading and SyncLock

So I have a program that calculates the pay for contractors that deliver logs to sawmills.  Every week a given mill might receive 500 to 1000 loads.  For each load, the mill might have to make separate payments to the trucker, the logging contractor, the landowner, etc.  Let's say for this example that the mill receives 1,000 loads and must make payments to 6 separate vendors, for a total 6,000 transactions.  Calculating the payment on each transaction is not simple and requires side calculations to contend with things like overloaded trucks, volume conserions, and rate schedule lookups based on the species, product type, location, terrain. etc.

My program takes all the rate schedule information and marries it up to each load and calculates the transaction amount.  The proecessing time for a pay run of this size is about 15 minutes, or about .15 seconds per transaction (= .9 seconds per load).  Not a long time, except that when clerks make data entry mistakes and/or rate mistakes, the routine has to be run multiple times.  The time can really add.

The good thing about this problem is that the pay for one load is independent from another load.  This is where I got the grand idea of making the pay routine a multi-threaded adventure.  I figure if I have a quad-core machine, I could make use of the extra power and spin up to say 4 threads at a time and greatly reduce the calculation time.

  

The pseudo-code for the base calculation function goes something like this:

 Sub CalcPay

     Dim i as integer

     Dim ldRS as recordset

     ldRS = LoadRecordset("SELECT * FROM LOADS WHERE LOAD_DATE BETWEEN '2014-05-01' AND '2014-05-07'")  'Load up a recordset of loads 

     Do Until ldRS.EOF = False

          Dim aLoad as clsLoad

          aLoad = new clsLoad

          aLoad.Load(ldRS("ID").value) 'Load up the data from the load recordset into a clsLoad Object

          iRet = CalcLoadPay(aLoad)  'Calc the Load Pay

          ldRS.Movenext

     Loop

End Sub

Sub CalcPay(aLoad clsLoad)

     ...

     Do a Bunch of Calculations

     ....

     id = getNextIndex

     UpdateDatabaseTableForPay(id, aLoad)

End Function

Function getNextIndex() as Integer 

    Dim aRs as recordset

    Dim nextID as integer

    aRs = LoadRecordSet("SELECT NEXT_ID FROM TBL_INDICES WHERE MYINDEX = 'PAY_CALCS'")  'Get the Next ID

    nextID = aRS(0).Value

    sql = "UPDATE TBL_INDICES SET NEXT_ID = NEXT_ID + 1 WHERE MYINDEX = 'PAY_CALCS'"  'Update the Indices Table

    aConn.Execute

End Function

So, what we have here is a main calculator routine tht loops over the list of loads.  For each load in load up an object with all the data for the load.  The routine then calls a calculator that does all the pay calculations.  When the calcutions are done, it then writes the result to the database.  The write the the database takes three steps:

1.  Get the next transation ID.

2.  Update the tables that stores the transaction IDs (in this example, the table is named TBL_INDICES).

3.  Write out the pay data using the transaction ID as the primary key.

The program runs as is.  But, I want to make it run faster using a threaded approach.

The changes to the main routine are as follows:

Imports System.Threading

Sub CalcPay

     Dim i as integer

     Dim ldRS as recordset

     Dim parmams as clsParams 'See Below

     Dim loadCollection as Collection 'collection of loads currently being processed

     loadCollection = new Collection

     ldRS = LoadRecordset("SELECT * FROM LOADS WHERE LOAD_DATE BETWEEN '2014-05-01' AND '2014-05-07'")  'Load up a recordset of loads 

     Do Until ldRS.EOF = False

          Dim aLoad as clsLoad

          aLoad = new clsLoad

          aLoad.Load(ldRS("ID").value) 'Load up the data from the load recordset into a clsLoad Object

          payThread = New Thread(AddressOf ProcessLoad)

          params = new clsParams

          params.aLoad = aLoad  'The Load

          loadCollection.Add(aLoad.ID)  'Add load to collection

          parmas.loadCollection = loadCollection

          payThread.Start(params)

          Do Until threadColl.Count < 6  'Don't add another thread until there are less than 6 running

               Application.DoEvents()

          Loop

          ldRS.Movenext

      Loop

      Do Until threadColl.Count = 0  'Stall until done 

           Application.DoEvents()

      Loop

End Sub

Class clsParams

     Public aLoad as clsLoad

     Public loadCollection as collection

End class

Sub CalcPay(ldParams as clsParams)

     aLoad = ldParams.aLoad

     ...

     Do a Bunch of Calculations

     ....

     id = getNextIndex

     UpdateDatabaseTableForPay(id, aLoad)

     ldParams.loadCollection.Remove(aLoad.ID)  'Remove load from collection when done!

End Function

Things to note:

1.  I have created a collection of load IDs.  I add an ID every time I start a new thread.  I pass the collection along the thread.  When the thread finishes its work, it removes the load ID from the collection.  Yes, this works.  I then monitor the count of IDs in the collection and don't start a new thread until there are less than six.  Otherwise, I would end up spinning up a zillion threads and accomplish nothing.

2.  I can't pass more than one parameter to a thread.  Thus, I have to make a class (called clsParams).  I then add both my load object and my load collection to my class and then pass the class to the thread.  More work than I should have to do, but this too does work.

3.  So far so good.   Everything works for a bit and thing the program crashes and burns.  I get a bunch of errors for trying to insert transaction records with duplicate IDs.  I study the code for a while and realize that if 2 threads are real close in their timing that thread 1 can get a new ID.  And thread 2 can get the same ID IF thread 1 has NOT finished updating the table that stores the indices.

Here is the code again:

Function getNextIndex() as Integer 

    Dim aRs as recordset

    Dim nextID as integer

    aRs = LoadRecordSet("SELECT NEXT_ID FROM TBL_INDICES WHERE MYINDEX = 'PAY_CALCS'")  'Get the Next ID

    nextID = aRS(0).Value

    sql = "UPDATE TBL_INDICES SET NEXT_ID = NEXT_ID + 1 WHERE MYINDEX = 'PAY_CALCS'"  'Update the Indices Table

    aConn.Execute(sql)

End function

You can see that if the threads are close enough together that 2 transactions could get the same index.  The ordering would be:

1.  Thread 1:  Read the Next_ID

2.  Thread 2:  Read the Next_ID

3.  Thread 1:  Update the TBL_INDICES table.

4.  Thread 2:  Update the TBL_INDICES table.

What to do?

The first thing I tried was to wrap the 2 statements in a transaction, as so:This does NOT work!

Function getNextIndex() as Integer 

    Dim aRs as recordset

    Dim nextID as integer

    aConn.BeginTransaction

    aRs = LoadRecordSet("SELECT NEXT_ID FROM TBL_INDICES WHERE MYINDEX = 'PAY_CALCS'")  'Get the Next ID

    nextID = aRS(0).Value

    sql = "UPDATE TBL_INDICES SET NEXT_ID = NEXT_ID + 1 WHERE MYINDEX = 'PAY_CALCS'"  'Update the Indices Table

    aConn.Execute(sql)

    aConn.CommitTrans

End function

I had always thought that a transaction would lock the row for selects and updates/inserts.  But it does not.  It may stop and insert/update into the TBL_INDICES table while the transaction is open, but it does NOT block the select statement.

Thus, I am no better with my duplicates.

So, I Google some and find the syncLock statement.

Basically, the SyncLock statement says:  No other thread can execute this code until I am finished with it.  By the way, it can be any code, not just database reads and writes.

The structure of syncLock is:

SyncLock(someObject)

    ...

    Do a bunch of stuff

    ...

End SyncLock

The parameter "someObject" stores the state of the "locked code."  At first, I tried making it a local variable (see below)

Function getNextIndex() as Integer 

    Dim aRs as recordset

    Dim nextID as integer

    Dim someObject as New Object

    syncLock(someObject)

        aConn.BeginTransaction

        aRs = LoadRecordSet("SELECT NEXT_ID FROM TBL_INDICES WHERE MYINDEX = 'PAY_CALCS'")  'Get the Next ID

        nextID = aRS(0).Value

        sql = "UPDATE TBL_INDICES SET NEXT_ID = NEXT_ID + 1 WHERE MYINDEX = 'PAY_CALCS'"  'Update the Indices Table

        aConn.Execute(sql)

        aConn.CommitTrans

     End SyncLock

End function

This does NOT work.  Although the executing thread knows about someObject, the other threads do not.  So they just thumb their noses and plow on.

You have to make someObject a Global variable so that all the threads can see it.  Once I made this change, my program finally worked.

Its now time for a DeSchutes IPA

 

 

 

 

 

Print | posted on Saturday, May 10, 2014 3:58 PM |

Powered by:
Powered By Subtext Powered By ASP.NET