Add the ability to specify in the isc_dsql_fetch (or in another API) how many records need to be prefetched from the server [CORE6013] #6263

Open
firebird-automations opened this issue Feb 26, 2019 · 3 comments

@firebird-automations

Submitted by: @abzalov

Add the ability to specify in the isc_dsql_fetch (or in another API) how many records need to be prefetched from the server.

This feature would help avoid network fragmentation, as well as the repeated context switches that occur when the server is called again and again to fetch the next few records.
As a result, client application performance improves and the number of small calls to the server drops: the select is processed in large chunks, and there is no need to return to it many times.

I know that in the current implementation Firebird decides on its own how many records to return: as many as fit in a packet of the client/server exchange protocol.
But:
- in some cases it may be necessary to control the number of returned records more precisely (based on the specifics of the application)
- this does not solve the problem of network fragmentation
- this makes it impossible for the server to process the request in bulk, rather than returning to it many times
- this does not solve the problem of constant round trips to the server to retrieve the next records, where one request and one big answer (possibly fragmented into parts, but sent at one time) would be enough
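
To put a rough number on the round-trip point above, here is a minimal sketch. It assumes each fetch request returns at most a fixed number of rows; the function name `round_trips` is hypothetical and not part of any Firebird API.

```python
def round_trips(total_rows: int, rows_per_batch: int) -> int:
    """Number of fetch requests needed to read total_rows when each
    server response carries at most rows_per_batch rows.

    Hypothetical helper for illustration only; it models no real API.
    """
    if total_rows < 0 or rows_per_batch <= 0:
        raise ValueError("total_rows must be >= 0 and rows_per_batch > 0")
    # Ceiling division using integers only.
    return -(-total_rows // rows_per_batch)


if __name__ == "__main__":
    for batch in (1, 100, 1000):
        print(batch, round_trips(10_000, batch))
```

Under this simplified model, reading 10,000 rows takes 10,000 requests at one row per response and 100 requests at 100 rows per response, which is why a caller-controlled prefetch count matters on high-latency links.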

A similar feature is available in Oracle and Postgres.
Oracle documentation - https://docs.oracle.com/cd/B28359_01/appdev.111/b28395/oci04sql.htm#i429698 (see OCI_ATTR_PREFETCH_ROWS)
Postgres documentation - https://www.postgresql.org/docs/9.6/sql-fetch.html (see FORWARD count)

Oracle has 2 options:
1) use of OCI_ATTR_PREFETCH_ROWS - the requested number of records is cached in the client library internal buffer (oci.dll).
Then, on the next OCIStmtFetch calls, the records are taken from the internal buffer, and not requested from the server until the buffer is empty.
2) the ability to get an array of records immediately to the client's application buffer (not the client library internal buffer).
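
The first Oracle option (a client-library buffer that is refilled only when empty) can be sketched as follows. This is an illustration of the buffering idea only, not OCI or Firebird code: `PrefetchingCursor`, `make_fake_server` and the `fetch_batch` callable are hypothetical names, and the "server" is an in-memory list.

```python
import itertools
from collections import deque
from typing import Callable, Iterable, List, Optional, Tuple

Row = Tuple


class PrefetchingCursor:
    """Client-side row buffer in the spirit of OCI_ATTR_PREFETCH_ROWS.

    fetch_batch is a hypothetical callable standing in for one network
    round trip: given n, it returns up to n rows (fewer at end of data).
    """

    def __init__(self, fetch_batch: Callable[[int], List[Row]], prefetch_rows: int):
        if prefetch_rows <= 0:
            raise ValueError("prefetch_rows must be positive")
        self._fetch_batch = fetch_batch
        self._prefetch_rows = prefetch_rows
        self._buffer: deque = deque()
        self._exhausted = False
        self.server_calls = 0  # counts simulated round trips

    def fetch_one(self) -> Optional[Row]:
        """Return the next row, or None when the result set is finished."""
        if not self._buffer and not self._exhausted:
            batch = self._fetch_batch(self._prefetch_rows)
            self.server_calls += 1
            # A short batch means the server has no more rows.
            if len(batch) < self._prefetch_rows:
                self._exhausted = True
            self._buffer.extend(batch)
        return self._buffer.popleft() if self._buffer else None


def make_fake_server(rows: Iterable[Row]) -> Callable[[int], List[Row]]:
    """Hypothetical in-memory 'server' that hands out rows in order."""
    it = iter(rows)
    return lambda n: list(itertools.islice(it, n))


if __name__ == "__main__":
    cursor = PrefetchingCursor(make_fake_server([(i,) for i in range(10)]), 4)
    while cursor.fetch_one() is not None:
        pass
    print("round trips:", cursor.server_calls)
```

The application still fetches one row at a time, but only the calls that find the buffer empty reach the server.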

Postgres has only 1 option:
1) obtaining an array of records from the server to the client library internal buffer.
Then, the user application can access the row values of the records by index.
If necessary, the application can request the next portion of the array of records that will overwrite the current contents of the buffer.

@firebird-automations

Commented by: @abzalov

If what is described in this article (https://dyemanov.github.io/records-batching/) is already implemented, then it would be enough to be able to change the value of the TcpRemoteBufferSize parameter per request through the API, rather than only as a global setting on the client. It would be nice if the parameter took its value in records rather than bytes.

For example, using a parameter in isc_dsql_fetch or its equivalent.

As a result, this would cover the same needs as OCI_ATTR_PREFETCH_ROWS, but better (thanks to the adaptive modes described in the article).
Quote:
- "But obviously, it wastes a lot of time in the case of slow networks. So Firebird uses the asynchronous batching, also known as pipelining. As soon as all records of the batch are sent to the client, the server starts to fetch new records from the engine and cache them for the next transmission."
- "As soon as the client library has processed some part of the current batch, it asks the server for the next batch and continues processing the remaining records. This allows to distribute the load more evenly and provide a better overall throughput. The current (hardcoded) pipelining threshold is 1/2 of the batch size."
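
The half-batch threshold from the quote can be illustrated with a small sketch. It is a simplified model, not Firebird's implementation: it assumes fixed-size batches and a known total row count (which a real client does not have), and `pipelined_requests` is a hypothetical name.

```python
from typing import List


def pipelined_requests(total_rows: int, batch_size: int) -> List[int]:
    """Return, for each follow-up batch, how many rows the client has
    consumed at the moment it asks for that batch, assuming it asks once
    half of the current batch has been processed.

    Simplified model: fixed batch size, total row count known in advance.
    """
    if total_rows < 0 or batch_size <= 0:
        raise ValueError("total_rows must be >= 0 and batch_size > 0")
    threshold = max(1, batch_size // 2)  # the 1/2 pipelining threshold
    requests = []
    batch_start = 0
    # Ask for another batch only while rows remain beyond the current one.
    while batch_start + batch_size < total_rows:
        requests.append(batch_start + threshold)
        batch_start += batch_size
    return requests


if __name__ == "__main__":
    print(pipelined_requests(100, 20))
```

In this model, with 100 rows and batches of 20, the client asks for the next batch after consuming 10, 30, 50 and 70 rows, so the server can prepare it while the client works through the second half of the current batch.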

@firebird-automations

Commented by: @mrotteveel

The network protocol has an option to specify the number of rows to fetch, but it is not surfaced in the native client (the native client will make a guess at a number for you).

Personally, I'm not happy with the decision made in Firebird 3 that the server limits the number of rows based on the message size, and that it overrides a client that asks for too many records (in the opinion of the server). In previous versions that just worked.

@firebird-automations

Commented by: @dyemanov

I remember that discussion and IIRC I agreed with you. I just need to find time to revisit that code...
