Halve the speed of FbDataReader.GetOrdinal() [DNET214] #222

firebird-automations · 2009-02-21T00:17:43Z

Submitted by: Gareth Goslett (gareth)

Attachments:
FbDataReaderTest.cs
FbDataReader.patch

By creating a hashtable on first access to the GetOrdinal method and populating it with the fields and indexes we only need to get the hash code of the requested field name for future lookups.

I did some tests on a table with 10 fields and 1000 records. I read all records 5 times so there were 5 * 1000 * 10 lookups.
My tests were run under NUnit so they are all worse than normal. Results are in milliseconds

CURRENT GetOrdinal
Total ordinal reads: 5000
Result 13
Result 11
Result 11
Result 10
Result 10
Average time 0.011
Total named reads : 5000
Result 33
Result 33
Result 36
Result 32
Average time 0.0268

NEW GetOrdinal
Total ordinal reads: 5000
Result 13
Result 9
Result 10
Result 10
Result 9
Average time 0.0102
Total named reads : 5000
Result 13
Result 16
Result 14
Result 14
Average time 0.0114

Here is the new method:

Hashtable names;
public override int GetOrdinal(string name)
{
this.CheckState();

if \(names == null\)
\{
	names = new Hashtable\(\);
  for \(int i = 0; i < this\.fields\.Count; i\+\+\)
  \{
  	names\.Add\(this\.fields\[i\]\.Alias\.ToLowerInvariant\(\)\.GetHashCode\(\), i\);
  \}
\}

int hash = name\.ToLowerInvariant\(\)\.GetHashCode\(\);

if \(names\.ContainsKey\(hash\) == true\)
\{
	return \(Int32\)names\[hash\];
\}

throw new IndexOutOfRangeException\("Could not find specified column in results\."\);

}

Commits: 902fbea

The text was updated successfully, but these errors were encountered:

firebird-automations · 2009-02-21T00:37:56Z

Modified by: Gareth Goslett (gareth)

description: By creating a hashtable on first access to the GetOrdinal method and populating it with the fields and indexes we only need to get the hash code of the requested field name for future lookups.

I did some tests on a table with 10 fields and 1000 records. I read all records 5 times so there were 5 * 1000 * 10 lookups.

CURRENT GetOrdinal
Total ordinal reads: 5000
Result 13
Result 11
Result 11
Result 10
Result 10
Average time 0.011
Total named reads : 5000
Result 33
Result 33
Result 36
Result 32
Average time 0.0268

NEW GetOrdinal
Total ordinal reads: 5000
Result 13
Result 9
Result 10
Result 10
Result 9
Average time 0.0102
Total named reads : 5000
Result 13
Result 16
Result 14
Result 14
Average time 0.0114

Here is the new method:

Hashtable names;
public override int GetOrdinal(string name)
{
this.CheckState();

if \(names == null\)
\{
	names = new Hashtable\(\);
  for \(int i = 0; i < this\.fields\.Count; i\+\+\)
  \{
  	names\.Add\(this\.fields\[i\]\.Alias\.ToLowerInvariant\(\)\.GetHashCode\(\), i\);
  \}
\}

int hash = name\.ToLowerInvariant\(\)\.GetHashCode\(\);

if \(names\.ContainsKey\(hash\) == true\)
\{
	return \(Int32\)names\[hash\];
\}

throw new IndexOutOfRangeException\("Could not find specified column in results\."\);

}

=>

By creating a hashtable on first access to the GetOrdinal method and populating it with the fields and indexes we only need to get the hash code of the requested field name for future lookups.

I did some tests on a table with 10 fields and 1000 records. I read all records 5 times so there were 5 * 1000 * 10 lookups.
My tests were run under NUnit so they are all worse than normal. Results are in milliseconds

CURRENT GetOrdinal
Total ordinal reads: 5000
Result 13
Result 11
Result 11
Result 10
Result 10
Average time 0.011
Total named reads : 5000
Result 33
Result 33
Result 36
Result 32
Average time 0.0268

NEW GetOrdinal
Total ordinal reads: 5000
Result 13
Result 9
Result 10
Result 10
Result 9
Average time 0.0102
Total named reads : 5000
Result 13
Result 16
Result 14
Result 14
Average time 0.0114

Here is the new method:

Hashtable names;
public override int GetOrdinal(string name)
{
this.CheckState();

if \(names == null\)
\{
	names = new Hashtable\(\);
  for \(int i = 0; i < this\.fields\.Count; i\+\+\)
  \{
  	names\.Add\(this\.fields\[i\]\.Alias\.ToLowerInvariant\(\)\.GetHashCode\(\), i\);
  \}
\}

int hash = name\.ToLowerInvariant\(\)\.GetHashCode\(\);

if \(names\.ContainsKey\(hash\) == true\)
\{
	return \(Int32\)names\[hash\];
\}

throw new IndexOutOfRangeException\("Could not find specified column in results\."\);

}

summary: Double the speed of FbDataReader.GetOrdinal() => Halve the speed of FbDataReader.GetOrdinal()

firebird-automations · 2009-02-21T00:43:19Z

Commented by: Gareth Goslett (gareth)

The unit test which I ran to get the results.
I am creating a generic database access layer and the unit test is specific for that. I am including it here so that you can see if my test was okay.

firebird-automations · 2009-02-21T00:43:19Z

Modified by: Gareth Goslett (gareth)

Attachment: FbDataReaderTest.cs [ 11374 ]

firebird-automations · 2009-02-21T12:02:21Z

Commented by: @cincuranet

I was thinking little bit about this kind of approach. Some ideas to discuss:
* We should generate these hashes probably not on first access for all fields but when this particular field is accessed first. Sometimes you're reading only one or two rows/columns and the overhead for generating the hashtable (espacially for a lot of columns) will be too big.
* I would rather not use Invariant culture, because it may cause colisions (we have these problems already) and maybe we don't have to use Contains method, because if it's not there, it result will be null and that's easy to understand for us, that the column name is wrong.

firebird-automations · 2009-02-21T22:51:36Z

Commented by: Gareth Goslett (gareth)

If we only store the hash at the time of request we will need to lookup the index of the field.
for (int i = 0; i < this.fields.Count; i++)
if (String.Compare(name, this.fields[i].Alias, StringComparison.OrdinalIgnoreCase) == 0)
{
namedIndex.Add(hash, i);
return i;
}
The number of possible string comparisons gets larger with more fields.

I like the generation on first access method but there does not seem to be a performance difference between that method and the cache when needed method.
If someone selects 100 fields but only wants to access the first field, they do not care too much about performance. :)

I changed the hash method to the following:
int hash = name.ToUpper(CultureInfo.CurrentCulture).GetHashCode();
ToUpper() is preferred by FxCop over ToLower(), something about it supporting more precise conversions.
I saw that "CultureInfo.CurrentCulture" is used alot in the client library.

firebird-automations · 2009-02-22T09:24:59Z

Commented by: @cincuranet

Don't start with implementation if still discussing how to do it. It's just worthless.

If you access the field, you have to do the lookup anyway right now, so we will only store the value for future usage (next rows). Pregeneration all info will not be a good for small results. And in fact, when you finish reading first complete row, next rows will be cached and if you stop after, you will not notice any performance gap.

> If someone selects 100 fields but only wants to access the first field, they do not care too much about performance. :)
That's not true. You can get some summary info in first row and then reading the rest of data. Or you're reading some data but on some places you're using only ExecuteScalar.

If you wanna to discuss this more, join list.

firebird-automations · 2009-02-24T19:24:05Z

Commented by: Gareth Goslett (gareth)

Replacement GetOrdinal() method.
I used String.ToUpper() to normalize the user supplied field name, the FxCop explanation is here: "http://msdn.microsoft.com/en-us/library/bb386042.aspx".

firebird-automations · 2009-02-24T19:24:05Z

Modified by: Gareth Goslett (gareth)

Attachment: FbDataReader.patch [ 11380 ]

firebird-automations · 2009-03-03T05:50:02Z

Commented by: @cincuranet

I've done some tests and filling my hashtable takes even for 1000 columns only 2ms (on my machine in VPC). That's IMO acceptable, taking into account, that in more than ~10 rows the performance benefit will be seen. It's implemented little bit differently - generics and only one lookup to see whether the key is there for better performance.

I'm storing directly name of column in uppercase, because when measuring the difference between string and int (+lookup time too) key, the results were same (in measurement error) (on 10000 columns). In fact for lookup for the string key was little bit faster.

firebird-automations · 2009-03-03T05:50:02Z

Modified by: @cincuranet

status: Open [ 1 ] => Resolved [ 5 ]

resolution: Fixed [ 1 ]

Fix Version: 2.5.0 Beta 2 [ 10340 ]

firebird-automations closed this as completed Mar 3, 2009

firebird-automations added fix-version: 2.5.0 Beta 2 resolution: fixed/done priority: trivial component: ado.net provider type: improvement labels Apr 25, 2021

firebird-automations assigned cincuranet Apr 25, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Halve the speed of FbDataReader.GetOrdinal() [DNET214] #222

Halve the speed of FbDataReader.GetOrdinal() [DNET214] #222

firebird-automations commented Feb 21, 2009

firebird-automations commented Feb 21, 2009

firebird-automations commented Feb 21, 2009

firebird-automations commented Feb 21, 2009

firebird-automations commented Feb 21, 2009

firebird-automations commented Feb 21, 2009

firebird-automations commented Feb 22, 2009

firebird-automations commented Feb 24, 2009

firebird-automations commented Feb 24, 2009

firebird-automations commented Mar 3, 2009

firebird-automations commented Mar 3, 2009

Halve the speed of FbDataReader.GetOrdinal() [DNET214] #222

Halve the speed of FbDataReader.GetOrdinal() [DNET214] #222

Comments

firebird-automations commented Feb 21, 2009

firebird-automations commented Feb 21, 2009

firebird-automations commented Feb 21, 2009

firebird-automations commented Feb 21, 2009

firebird-automations commented Feb 21, 2009

firebird-automations commented Feb 21, 2009

firebird-automations commented Feb 22, 2009

firebird-automations commented Feb 24, 2009

firebird-automations commented Feb 24, 2009

firebird-automations commented Mar 3, 2009

firebird-automations commented Mar 3, 2009