let somenumbers = [1..5]
let sum = Seq.fold (fun a b -> a + b) 0 somenumbers
In the above two lines I have substituted (fun a b -> a + b) for the (+) operator.
What is this? It is an anonymous (unnamed) function that takes two arguments and returns their sum. Well, that's great, but how does it work with the function Seq.fold? Here's the trick. Seq.fold takes the list, works through it from left to right, and uses each value in turn as an input value.
In this example, the first addition operation that is performed takes the starting value of 0 as the first parameter and the first item on the list (=1) as the second parameter. This, of course, returns the value of 1. The second addition operation takes the result of the first operation (=1) as the first parameter and the second item on the list (=2) as the second parameter. The result of the second operation is 3; this is now used as the first parameter for the third addition operation. The third item on the list (=3) is the second parameter. The result is now 6.
So the series of operations is:
0 + 1 = 1 -> + 2 = 3 -> + 3 = 6 -> + 4 = 10 -> + 5 = 15
In a sense, fold lets you repeatedly apply a function to a list (always from left to right), where the result of each call serves as the first input parameter to the next call. And, because the first call has no previous result to use, you always need to provide a seed or starting value as its first parameter.
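To make the mechanics concrete, here is a minimal sketch of a hand-written left fold. The name myFold is hypothetical and is used only for illustration; it is not how Seq.fold is implemented in the library. List.scan is a standard function that returns every intermediate value, so it shows the same chain of additions as the series above.

```fsharp
// myFold is a hypothetical, hand-rolled left fold used only for illustration.
// It feeds the running result (acc) back in as the first argument of f.
let rec myFold f acc items =
    match items with
    | [] -> acc                                  // nothing left: return the running result
    | x :: rest -> myFold f (f acc x) rest       // combine, then continue with the rest

let somenumbers = [1..5]

// Same call shape as Seq.fold: function, seed, list
let total = myFold (fun a b -> a + b) 0 somenumbers

// List.scan keeps every intermediate result, starting with the seed
let steps = List.scan (fun a b -> a + b) 0 somenumbers

printfn "%d" total    // 15
printfn "%A" steps    // [0; 1; 3; 6; 10; 15]
```

The seed appears as the first element of the scan output, which is why the seed has to be supplied by the caller.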
So, looking back at the statement:
return (Seq.fold (fun a b -> a + b) 0 asyncWorkflow) }
you can see that it is accumulating the results from asyncWorkflow. asyncWorkflow is the array of results returned when we run all our functions in parallel. There will be one entry (the return value) per function in our group of async functions. The sum of the results is returned in the variable 'Result'.
Note that we still have not executed anything yet because of the async wrapper. This means that the variable Result is not an integer, for example, but an object of type Async<int>. It is just another set of instructions.
(3) We don't actually run anything until we hit the statement:
let R = Async.Run Result
The function Async.Run actually sends our set of functions off to be executed. It takes an argument of type Async<??>, waits for all of the asynchronous functions to return, and then gives back the result.
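The deferred behaviour can be seen in a small sketch. It assumes a current F# release, where the function is named Async.RunSynchronously rather than Async.Run. The names started and work are hypothetical and exist only for this illustration.

```fsharp
// 'started' counts how many times the body of the workflow has actually run.
let started = ref 0

// Defining the workflow only builds a set of instructions; nothing runs yet.
let work : Async<int> =
    async {
        started.Value <- started.Value + 1
        return 42
    }

let before = started.Value                  // still 0: the body has not run

// Assumption: Async.RunSynchronously plays the role of Async.Run here.
let answer = Async.RunSynchronously work

let after = started.Value                   // now 1: the body ran exactly once

printfn "before = %d, answer = %d, after = %d" before answer after
```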
So now we have to apply this methodology to our problem. Most of the work is done by a function I call runIndustryGroup. This function grabs all of the major company groups that belong to a given industry group. As noted previously, I have 100 major industry groups. These are aggregated into 9 basic industry sectors (such as resources, construction, services, retail, durable manufacturing, etc.). Once I have all the major groups, I loop over them to build a list of input parameters that can be used to call my BuildIndustryIndex function. I then create a list of Async objects, each of which wraps a call to my BuildIndustryIndex function with one set of input parameters. Once the collection is built, I fire it off with an Async.Run command.
----------------------------------------------------------------------------------
let runIndustryGroup (industry_order : int) (sd : DateTime) (ed : DateTime) =
    let conn = new System.Data.SqlClient.SqlConnection(connstr)
    let iOpen = conn.Open()
    let sql = "SELECT ID, GROUP_TYPE_ID, INDUSTRY_ORDER, MAJOR_ORDER, MINOR_ORDER, INDUSTRY_NAME, TICKER_ID FROM VL_INDUSTRIES WHERE INDUSTRY_ORDER = " + industry_order.ToString()
    let cmd = new System.Data.SqlClient.SqlCommand(sql, conn)
    let reader = cmd.ExecuteReader()
    let idxParams = new List<indexParam>()
    while reader.Read() do
        idxParams.Add {id = reader.GetInt32(0); group_type_id=reader.GetInt32(1); industry_order=reader.GetInt32(2); major_order=reader.GetInt32(3); minor_order=reader.GetInt32(4); industry_name=reader.GetString(5); ticker_id=reader.GetInt32(6); sd = sd; ed = ed;}
        ()
    let asyncList = new List<Async<int>>()
    for industry in idxParams do
        let (iret : Async<int>) = async {return BuildIndustryIndex industry.id industry.group_type_id industry.industry_order industry.major_order industry.minor_order industry.industry_name industry.ticker_id industry.sd industry.ed }
        asyncList.Add iret
    let Result =
        async { let! asyncWorkflow = Async.Parallel asyncList
                return (Seq.fold (fun a b -> a + b) 0 asyncWorkflow) }
    let R = Async.Run Result
    printf "Result = %i\n" R
----------------------------------------------------------------------------------
The first part of my code collects all the major groups within a given industry and builds a list of input parameters. I first set up a record type called indexParam and then create a list called idxParams. The indexParam record type contains all the parameters I need for my BuildIndustryIndex function.
type indexParam = {id: int; group_type_id: int; industry_order: int; major_order: int; minor_order: int; industry_name: string; ticker_id: int; sd : DateTime; ed : DateTime;}
let idxParams = new List<indexParam>()
while reader.Read() do
    idxParams.Add {id = reader.GetInt32(0); group_type_id=reader.GetInt32(1); industry_order=reader.GetInt32(2); major_order=reader.GetInt32(3); minor_order=reader.GetInt32(4); industry_name=reader.GetString(5); ticker_id=reader.GetInt32(6); sd = sd; ed = ed;}
My SQL Data reader is used to populate my list of parameters. In the next snippet:
for industry in idxParams do
    let (iret : Async<int>) = async {return BuildIndustryIndex industry.id industry.group_type_id industry.industry_order industry.major_order industry.minor_order industry.industry_name industry.ticker_id industry.sd industry.ed }
    asyncList.Add iret
I loop over my list of parameters and build my collection of Async<int> objects. Note that for each item in the idxParams list, I create an object like:
let (iret : Async<int>) = async {return BuildIndustryIndex ....}
and add it to my list asyncList.
asyncList.Add iret
Once I have my asyncList, I can fire it off with an Async.Run command.
let Result =
    async { let! asyncWorkflow = Async.Parallel asyncList
            return (Seq.fold (fun a b -> a + b) 0 asyncWorkflow) }
let R = Async.Run Result
printf "Result = %i\n" R
This code is in the exact same format as above. 'Result' describes the work, and the value 'R' returned by Async.Run is the sum of the return values of the BuildIndustryIndex calls.
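The whole pattern (build the parameters, wrap each call in async, run them in parallel, fold the results) can be tried without a database. The following is only a sketch under stated assumptions: JobParam and buildIndex are hypothetical stand-ins for indexParam and BuildIndustryIndex, buildIndex is assumed to return 1 for each completed job, and Async.RunSynchronously is used in place of Async.Run as in current F# releases.

```fsharp
// Hypothetical stand-in for the indexParam record
type JobParam = { id : int; name : string }

// Hypothetical stand-in for BuildIndustryIndex.
// Assumption: it returns 1 when the job completes, 0 otherwise.
let buildIndex (id : int) (name : string) : int =
    if id > 0 && name.Length > 0 then 1 else 0

// In the real program these would come from the SQL data reader
let jobParams =
    [ { id = 1; name = "resources" }
      { id = 2; name = "construction" }
      { id = 3; name = "retail" } ]

// One Async<int> per set of parameters; nothing runs at this point
let asyncList : Async<int> list =
    [ for p in jobParams -> async { return buildIndex p.id p.name } ]

// Still only a description of the work: run all jobs in parallel, then sum
let result : Async<int> =
    async {
        let! results = Async.Parallel asyncList   // results is an int array
        return Seq.fold (fun a b -> a + b) 0 results
    }

// This is the line that actually executes everything
let r = Async.RunSynchronously result
printfn "Result = %i" r
```

With the three sample jobs above, each stand-in call returns 1, so the sum printed is 3.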
The very last part of the program is trivial. It is just a loop that runs over each of my major industry sectors.
So, at long last we can do some speed tests!
I ran the program both with and without the Async logic.
I ran it twice with the Async logic and got times of 3:07 and 2:51.
I ran it twice without the Async logic (meaning sequentially) and got times of 6:48 and 6:32.
This is pretty big: the Async version finished in less than half the time, a speedup of roughly 2.2x.
I know this is not a rigorous test, but it sure gives me confidence that I can give my applications that require a slew of calculations a big boost!
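If you want to try a comparison like this yourself, here is a rough sketch of a timing harness. It is not the test I ran: slowJob is a hypothetical stand-in that just sleeps for 200 ms, the timings depend entirely on the machine, and Async.RunSynchronously is used in place of Async.Run as in current F# releases.

```fsharp
open System.Diagnostics
open System.Threading

// Hypothetical stand-in for one slow calculation: sleeps, then returns 1
let slowJob (n : int) : int =
    Thread.Sleep(200)
    1

let inputs = [1..4]

// Sequential: one job after another
let sw1 = Stopwatch.StartNew()
let seqTotal = inputs |> List.map slowJob |> List.sum
sw1.Stop()

// Parallel: wrap each job in async and run them together
let sw2 = Stopwatch.StartNew()
let parTotal =
    inputs
    |> List.map (fun n -> async { return slowJob n })
    |> Async.Parallel
    |> Async.RunSynchronously
    |> Array.sum
sw2.Stop()

printfn "sequential: %d ms, total = %d" sw1.ElapsedMilliseconds seqTotal
printfn "parallel:   %d ms, total = %d" sw2.ElapsedMilliseconds parTotal
```

Both versions should produce the same total; only the elapsed time should differ, and by how much depends on the number of cores and on how busy the thread pool is.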