I have a collection of objects IEnumerable<object> obs.
I have another collection of objects IEnumerable<object> data.
For each ob in obs, I need to find the first item in data that has the same value in a certain property as ob. For example, I could be looking for the first item in data that has the same ToString() value as ob. Once the first item with a matching property value is found, I do something with that data item and then move on to the next ob in obs. If no match is found, I throw an error.
Here is a naive approach:
    foreach (object ob in obs)
    {
        foreach (object dataOb in data)
            if (ob.ToString() == dataOb.ToString())
            {
                ... // do something with dataOb
                goto ContinueOuter;
            }
        throw new Exception("No matching data found.");
        ContinueOuter: ;
    }
The disadvantage is that dataOb.ToString() is recalculated for every ob in obs, which is unnecessary.
I could cache it:
    IDictionary<object, string> dataToDataStr = new Dictionary<object, string>();
    foreach (object dataObj in data) // collect all ToString values in advance
        dataToDataStr.Add(dataObj, dataObj.ToString());
    foreach (object ob in obs)
    {
        foreach (object dataOb in dataToDataStr.Keys)
            if (ob.ToString() == dataToDataStr[dataOb])
            {
                ... // do something with dataOb
                goto ContinueOuter;
            }
        throw new Exception("No matching data found.");
        ContinueOuter: ;
    }
The disadvantage is that I calculate all ToString() values up front, even though that might not be necessary. For example, I might find all matching data objects in the first half of the data collection.
How can I build up the dataToDataStr dictionary (or any other enumerable data structure that lets me retrieve both the object and its only-once-calculated ToString value) lazily?
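As a partial step toward that, here is a minimal sketch that pairs each data item with a `Lazy<string>`, so that ToString() runs at most once per item and only when the item is first compared. The class name `LazyKeyMatching`, the method `MatchAll`, and the returned list (a stand-in for "do something with dataOb") are hypothetical. Note the assumption that walking data once up front is acceptable: `ToList()` still enumerates the whole sequence, it just does not call ToString() on it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyKeyMatching
{
    // Hypothetical helper: returns, for each ob, the first data item
    // whose ToString() value equals ob.ToString().
    public static List<object> MatchAll(IEnumerable<object> obs, IEnumerable<object> data)
    {
        // ToList() walks data once, but no ToString() call happens here;
        // each Lazy<string> computes its value on first access and then caches it.
        var cached = data
            .Select(d => (Item: d, Key: new Lazy<string>(() => d.ToString())))
            .ToList();

        var matches = new List<object>();
        foreach (object ob in obs)
        {
            string obKey = ob.ToString();
            // First() throws InvalidOperationException when nothing matches.
            var hit = cached.First(c => c.Key.Value == obKey);
            matches.Add(hit.Item); // stand-in for "do something with dataOb"
        }
        return matches;
    }
}
```

A list of pairs is used instead of a dictionary keyed by the object, since a dictionary would throw on `Add` if the same object occurred twice in data, and the order in which a dictionary enumerates its keys is not guaranteed.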
Here is code (mixed with pseudocode) of what I have in mind:
    IDictionary<object, string> dataToDataStr = new Dictionary<object, string>();
    object lastProcessedDataOb = null;
    foreach (object ob in obs)
    {
        // first search the data objects that are already cached
        foreach (object dataOb in dataToDataStr.Keys)
            if (ob.ToString() == dataToDataStr[dataOb])
            {
                ... // do something with dataOb
                goto ContinueOuter;
            }
        // pseudocode: continue enumerating data where the previous search stopped
        foreach (object dataOb in data STARTING AFTER lastProcessedDataOb)
        // if lastProcessedDataOb == null, start with the first entry of data
        {
            dataToDataStr.Add(dataOb, dataOb.ToString());
            lastProcessedDataOb = dataOb;
            if (ob.ToString() == dataToDataStr[dataOb])
            {
                ... // do something with dataOb
                goto ContinueOuter;
            }
        }
        throw new Exception("No matching data found.");
        ContinueOuter: ;
    }
I know this would be easy if data were a LinkedList or any collection with indexed access (then I could store a linked list node or an index as lastProcessedDataOb), but it isn't - it is an IEnumerable. Maybe yield return can be used here?
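One way the "STARTING AFTER lastProcessedDataOb" step might be realized without indexed access is to hold on to a single `IEnumerator<object>` for data across all iterations of the outer loop: the enumerator itself remembers the position. The following is only a sketch under that assumption. The names `LazyCacheMatching`, `MatchAll`, and `seen` are hypothetical, and the returned list stands in for "do something with dataOb".

```csharp
using System;
using System.Collections.Generic;

public static class LazyCacheMatching
{
    public static List<object> MatchAll(IEnumerable<object> obs, IEnumerable<object> data)
    {
        var matches = new List<object>();
        // Items pulled from data so far, in original order, with their cached strings.
        var seen = new List<(object Item, string Key)>();

        // One enumerator shared by all iterations of the outer loop.
        using (IEnumerator<object> rest = data.GetEnumerator())
        {
            foreach (object ob in obs)
            {
                string obKey = ob.ToString();
                object found = null;
                bool hasMatch = false;

                // 1) Search the part of data that is already cached.
                foreach (var entry in seen)
                {
                    if (entry.Key == obKey)
                    {
                        found = entry.Item;
                        hasMatch = true;
                        break;
                    }
                }

                // 2) Otherwise pull further items, caching each one as it is read.
                while (!hasMatch && rest.MoveNext())
                {
                    object dataOb = rest.Current;
                    string key = dataOb.ToString();
                    seen.Add((dataOb, key));
                    if (key == obKey)
                    {
                        found = dataOb;
                        hasMatch = true;
                    }
                }

                if (!hasMatch)
                    throw new InvalidOperationException("No matching data found.");
                matches.Add(found); // stand-in for "do something with dataOb"
            }
        }
        return matches;
    }
}
```

With this shape, data is enumerated at most once and only as far as the last match requires, and ToString() is called once per data item that is actually read. If scanning `seen` becomes the bottleneck, a `Dictionary<string, object>` keyed by the string and holding the first item per key might replace the list.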