Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The performance when doing parametrized builk inserts is poor [DNET532] #512

Closed
firebird-automations opened this issue Nov 21, 2013 · 13 comments

Comments

@firebird-automations
Copy link

Submitted by: Toni Martir (toni_martir)

Votes: 1

The performance when executing a query with named parameters (50 or more) is very poor.
Example:
With current code it takes about 40 seconds to insert 20.000 records.
With small optimizations it take about 5 seconds.

The problem is in FbParameterCollection.IndexOf

I obtained the much better results modifying FbParameterCollection and FbParameter,
but may be this is not the correct way...

With the proposed optimization only FbParameterCollection must be modified and small modification to FbParameter.

  SortedList<string, int\> sortedparams;
  SortedList<string, int\> sortedparamsint;

Constructor
this.sortedparams = new SortedList<string, int>();
this.sortedparamsint = new SortedList<string, int>();

Add
this.parameters.Add(value);
sortedparams.Add(value.ParameterName,this.parameters.Count-1);
sortedparamsint.Add(value.InternalParameterName, this.parameters.Count - 1);
IndexOf
int idx = sortedparamsint.IndexOfKey(parameterName);
if (idx == -1)
{
idx = sortedparams.IndexOfKey(parameterName);
if (idx >= 0)
return sortedparams.Values[idx];
else
return -1;
}
else
return sortedparamsint.Values[idx];
public override void Clear()
{
this.parameters.Clear();
sortedparams.Clear();
sortedparamsint.Clear();
}

To detect parameter changes, in FbParameter:
public override string ParameterName
{
get { return this.parameterName; }
set {
this.parameterName = value;
if (this.parent != null)
this.Parent.RebuildSorted();
}
}

And define RebuildSorted in the FbParameterCollection
internal void RebuildSorted()
{
sortedparams.Clear();
sortedparamsint.Clear();
for (int idx = 0;idx <parameters.Count;idx++)
{
sortedparams.Add(parameters[idx].ParameterName,idx);
sortedparamsint.Add(parameters[idx].InternalParameterName,idx);
}
}

I fixed the IndexOf and added Parameters Clear, now it seems
to work correcly, and the performance increase is about 500%-1000% with a 70 parameters insert query.

Commits: 7915e5f 78f6cf7 ac26ea8 62117a2

@firebird-automations
Copy link
Author

Modified by: Toni Martir (toni_martir)

description: The performance when executing a query with named parameters (50 or more) is very poor.
Example:
With current code it takes about 40 seconds to insert 20.000 records.
With small optimizations it take about 5 seconds.

The problem is in FbParameterCollection.IndexOf

I obtained the much better results modifying FbParameterCollection and FbParameter,
but may be this is not the correct way...

With the proposed optimization only FbParameterCollection must be modified and small modification to FbParameter.

  SortedList<string, int\> sortedparams;
  SortedList<string, int\> sortedparamsint;

Constructor
this.sortedparams = new SortedList<string, int>();
this.sortedparamsint = new SortedList<string, int>();

Add
this.parameters.Add(value);
sortedparams.Add(value.ParameterName,this.parameters.Count-1);
sortedparamsint.Add(value.InternalParameterName, this.parameters.Count - 1);
IndexOf
int idx = sortedparamsint.IndexOfKey(parameterName);
if (idx == -1)
idx = sortedparams.IndexOfKey(parameterName);
return idx;

To detect parameter changes, in FbParameter:
public override string ParameterName
{
get { return this.parameterName; }
set {
this.parameterName = value;
if (this.parent != null)
this.Parent.RebuildSorted();
}
}

And define RebuildSorted in the FbParameterCollection
internal void RebuildSorted()
{
sortedparams.Clear();
sortedparamsint.Clear();
for (int idx = 0;idx <parameters.Count;idx++)
{
sortedparams.Add(parameters[idx].ParameterName,idx);
sortedparamsint.Add(parameters[idx].InternalParameterName,idx);
}
}

=>

The performance when executing a query with named parameters (50 or more) is very poor.
Example:
With current code it takes about 40 seconds to insert 20.000 records.
With small optimizations it take about 5 seconds.

The problem is in FbParameterCollection.IndexOf

I obtained the much better results modifying FbParameterCollection and FbParameter,
but may be this is not the correct way...

With the proposed optimization only FbParameterCollection must be modified and small modification to FbParameter.

  SortedList<string, int\> sortedparams;
  SortedList<string, int\> sortedparamsint;

Constructor
this.sortedparams = new SortedList<string, int>();
this.sortedparamsint = new SortedList<string, int>();

Add
this.parameters.Add(value);
sortedparams.Add(value.ParameterName,this.parameters.Count-1);
sortedparamsint.Add(value.InternalParameterName, this.parameters.Count - 1);
IndexOf
int idx = sortedparamsint.IndexOfKey(parameterName);
if (idx == -1)
{
idx = sortedparams.IndexOfKey(parameterName);
return sortedparams.Values[idx];
}
else
return sortedparamsint.Values[idx];
public override void Clear()
{
this.parameters.Clear();
sortedparams.Clear();
sortedparamsint.Clear();
}

To detect parameter changes, in FbParameter:
public override string ParameterName
{
get { return this.parameterName; }
set {
this.parameterName = value;
if (this.parent != null)
this.Parent.RebuildSorted();
}
}

And define RebuildSorted in the FbParameterCollection
internal void RebuildSorted()
{
sortedparams.Clear();
sortedparamsint.Clear();
for (int idx = 0;idx <parameters.Count;idx++)
{
sortedparams.Add(parameters[idx].ParameterName,idx);
sortedparamsint.Add(parameters[idx].InternalParameterName,idx);
}
}

I fixed the IndexOf and added Parameters Clear, now it seems
to work correcly, and the performance increase is about 500%-1000% with a 70 parameters insert query.

@firebird-automations
Copy link
Author

Modified by: Toni Martir (toni_martir)

description: The performance when executing a query with named parameters (50 or more) is very poor.
Example:
With current code it takes about 40 seconds to insert 20.000 records.
With small optimizations it take about 5 seconds.

The problem is in FbParameterCollection.IndexOf

I obtained the much better results modifying FbParameterCollection and FbParameter,
but may be this is not the correct way...

With the proposed optimization only FbParameterCollection must be modified and small modification to FbParameter.

  SortedList<string, int\> sortedparams;
  SortedList<string, int\> sortedparamsint;

Constructor
this.sortedparams = new SortedList<string, int>();
this.sortedparamsint = new SortedList<string, int>();

Add
this.parameters.Add(value);
sortedparams.Add(value.ParameterName,this.parameters.Count-1);
sortedparamsint.Add(value.InternalParameterName, this.parameters.Count - 1);
IndexOf
int idx = sortedparamsint.IndexOfKey(parameterName);
if (idx == -1)
{
idx = sortedparams.IndexOfKey(parameterName);
return sortedparams.Values[idx];
}
else
return sortedparamsint.Values[idx];
public override void Clear()
{
this.parameters.Clear();
sortedparams.Clear();
sortedparamsint.Clear();
}

To detect parameter changes, in FbParameter:
public override string ParameterName
{
get { return this.parameterName; }
set {
this.parameterName = value;
if (this.parent != null)
this.Parent.RebuildSorted();
}
}

And define RebuildSorted in the FbParameterCollection
internal void RebuildSorted()
{
sortedparams.Clear();
sortedparamsint.Clear();
for (int idx = 0;idx <parameters.Count;idx++)
{
sortedparams.Add(parameters[idx].ParameterName,idx);
sortedparamsint.Add(parameters[idx].InternalParameterName,idx);
}
}

I fixed the IndexOf and added Parameters Clear, now it seems
to work correcly, and the performance increase is about 500%-1000% with a 70 parameters insert query.

=>

The performance when executing a query with named parameters (50 or more) is very poor.
Example:
With current code it takes about 40 seconds to insert 20.000 records.
With small optimizations it take about 5 seconds.

The problem is in FbParameterCollection.IndexOf

I obtained the much better results modifying FbParameterCollection and FbParameter,
but may be this is not the correct way...

With the proposed optimization only FbParameterCollection must be modified and small modification to FbParameter.

  SortedList<string, int\> sortedparams;
  SortedList<string, int\> sortedparamsint;

Constructor
this.sortedparams = new SortedList<string, int>();
this.sortedparamsint = new SortedList<string, int>();

Add
this.parameters.Add(value);
sortedparams.Add(value.ParameterName,this.parameters.Count-1);
sortedparamsint.Add(value.InternalParameterName, this.parameters.Count - 1);
IndexOf
int idx = sortedparamsint.IndexOfKey(parameterName);
if (idx == -1)
{
idx = sortedparams.IndexOfKey(parameterName);
if (idx >= 0)
return sortedparams.Values[idx];
else
return -1;
}
else
return sortedparamsint.Values[idx];
public override void Clear()
{
this.parameters.Clear();
sortedparams.Clear();
sortedparamsint.Clear();
}

To detect parameter changes, in FbParameter:
public override string ParameterName
{
get { return this.parameterName; }
set {
this.parameterName = value;
if (this.parent != null)
this.Parent.RebuildSorted();
}
}

And define RebuildSorted in the FbParameterCollection
internal void RebuildSorted()
{
sortedparams.Clear();
sortedparamsint.Clear();
for (int idx = 0;idx <parameters.Count;idx++)
{
sortedparams.Add(parameters[idx].ParameterName,idx);
sortedparamsint.Add(parameters[idx].InternalParameterName,idx);
}
}

I fixed the IndexOf and added Parameters Clear, now it seems
to work correcly, and the performance increase is about 500%-1000% with a 70 parameters insert query.

@firebird-automations
Copy link
Author

Commented by: @cincuranet

Can you check with latest version? There has been some improvements in parameters handling.

@firebird-automations
Copy link
Author

Modified by: @cincuranet

status: Open [ 1 ] => In Progress [ 3 ]

@firebird-automations
Copy link
Author

Commented by: tonim (tonim)

I'm not sure the performance is better than before, but it is still 5 times slower than the older verision I modified.

Inserting 10.000 records with 100 parameters takes 40 seconds.

The main problem is not solved, the problem is getting the index by parameter name. There is a list and
a sorted list should be used, at least in other providers I tested (sqlclient,sqlite) this problem does not exists.

Using a list:
100*50=5000 average string comparisons for each command

Using a sorted list:
100*log (100) = 500 average string comparisons for each command

With the older modified version of Firebird Client I'm using, the same process takes 8 seconds.

About accesiing by index:

The new version have been enhanced a bit. Older version I have takes 3.8 seconds to process 10.000 records while the new version takes 3.2 seconds.

Note that there is a big difference with the TRACE constant defined in compiler options, so any release should have this constant disabled, I downloaded the source, and I unchecked them from Release_XX configurations because the bad performance. more 300% penalty: inserting by index takes 15 seconds with the constant defined.

I will provide you the source code of a Windows.Forms project I created to test it, it creates the database, the table, and then you can test inserts by name or by index. I will provide you a link to download it.

@firebird-automations
Copy link
Author

Commented by: tonim (tonim)

This is a link where you can download the test application with source code:
https://www.dropbox.com/s/m2znc8bcyafxq6j/paramtest.7z?dl=0

@firebird-automations
Copy link
Author

Commented by: @cincuranet

I'm looking at SqlParameterCollection and they are using plain list as we do. I don't see them doing any magic.

Because it's called FbParameter***Collection*** nobody expects the lookup based on name to be fast. If you need every piece of performance, the index is the king. Even with the SortedList you proposed it's still ~10-15% slower.

So there's a clear benefit, no doubt, but only in few cases and the available solution outperforms it anyway. :\ Not entirely maintaining another collection is that good idea.

@firebird-automations
Copy link
Author

Commented by: Alexander Muylaert-Gelein (gonline)

Imho... every piece of performance gained is worth the effort. I agree that this should not be plan A and I wouldn't accept breaking changes for it, but a gain is a gain.

@firebird-automations
Copy link
Author

Commented by: @cincuranet

I don't agree fully. Performance is important for sure. But me this looks like band aid for wrong programming pattern. It's like choosing an array vs hashtable. Especially as this change will bring benefit only in a case with lot of parameters and a loop. On the other hand with few, say around 10, parameters this will have slight (very) performance hit.

@firebird-automations
Copy link
Author

Modified by: @cincuranet

status: In Progress [ 3 ] => Resolved [ 5 ]

resolution: Fixed [ 1 ]

Fix Version: vNext [ 10705 ]

@firebird-automations
Copy link
Author

Commented by: tonim (tonim)

As you can see here:
https://code.google.com/p/csharp-sqlite/source/browse/Community.CsharpSqlite.SQLiteClient/src/SqliteParameterCollection.cs?r=6b8f8a2aeec5ad97a5c032c7d6890484a2fe4356

Parameter named hash table is used to Access parameters by name. I supose sqlclient implementaion have the bottleneck in other location.

Performance about accesiing parameters is somewhat important, but is very important when you reach a bottleneck that is slowing your application about 500%. I have already a patched source because receiving 500 orders and inserting them into the database can be done in seconds instead of minutes.

Also take care that when executing the same ineficient code in other architechtures (Android-ARM) the performance diference is much greater, I experienced this but I don't know why.

@firebird-automations
Copy link
Author

Commented by: @cincuranet

You can (and probably should) use index in that "batch" case. It will be faster no matter what. Anyway. The hot spots have been identified and improved, as you can see in code.

@firebird-automations
Copy link
Author

Commented by: tonim (tonim)

>Because it's called FbParameter***Collection*** nobody expects the lookup based on name to be fast

It was a surprise for me to find a sequential list when accessing parameters by name, because other parts of FirebirdClient are really efficient.

>But me this looks like band aid for wrong programming pattern. It's like choosing an array vs hashtable. Especially as this change will bring benefit only in a case with lot of parameters and a loop.

I really disagree with this.

Programing insertions in a database from a client application is just doing this: lot of parameters in a loop. And this operations are done very often.
You know the advantages of using parameter names instead of indexes, we are not talking about 10% performance hit (index vs parameter).

And the good pattern here is to use a sortedlist,dictionary or hashtable or whatever have fast access:
- Few inserts and even less modifications. You usually define the parameters once.
- Lot of acceses (number_of_columns*number_of_rows_to_insert) .Often you insert more than one row.
- Unknown number of parameters, easilly more than 20.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants