Issue Details (XML | Word | Printable)

Key: CORE-6013
Type: Improvement Improvement
Status: Open Open
Priority: Major Major
Assignee: Unassigned
Reporter: Rashid Abzalov
Votes: 0
Watchers: 5
Operations

If you were logged in you would be able to see more operations.
Firebird Core

Add the ability to specify in the isc_dsql_fetch (or in another API) how many records need to be prefetched from the server

Created: 26/Feb/19 08:40 PM   Updated: 01/Mar/19 10:06 AM
Component/s: API / Client Library
Affects Version/s: 4.0 Beta 1
Fix Version/s: None

QA Status: No test


 Description  « Hide
Add the ability to specify in the isc_dsql_fetch (or in another API) how many records need to be prefetched from the server.

This feature will allow to avoid the fragmentation of the on the network, as well as to avoid multiple context switching during server calls to fetch next several records.
As a result, this leads to an increase of the client application performance and a reduction multiple but small calls to the server — the select will be processed in large parts, and there will be no need to return to it many times.

I know that in the current implementation, Firebird independently returns as many records as fit in the packet of server exchange protocol.
But:
  - in some cases it may be necessary to more finely manage the number of returned records (based on the specifics of the application)
  - this does not solve the problem of network fragmentation
  - this makes it impossible for the server to process the request in bulk, rather than returning to it many times
  - this does not solve the problem of permanent round-trip to the server to retrieve next records. Instead of one request and one big answer (let it be fragmented in parts, but sent at one time)

A similar feature is available in Oracle and Postgres.
  Oracle documentation - https://docs.oracle.com/cd/B28359_01/appdev.111/b28395/oci04sql.htm#i429698 (see OCI_ATTR_PREFETCH_ROWS)
  Postgres documentation - https://www.postgresql.org/docs/9.6/sql-fetch.html (see FORWARD count)

Oracle has 2 options:
  1) use of OCI_ATTR_PREFETCH_ROWS - the requested number of records is cached in the client library internal buffer (oci.dll).
    Then, on the next OCIStmtFetch calls, the records are taken from the internal buffer, and not requested from the server until the buffer is empty.
  2) the ability to get an array of records immediately to the client's application buffer (not the client library internal buffer).

Postgres has only 1 option:
  1) obtaining an array of records from the server to the client library internal buffer.
    Then, the user application can access the row values of the records by index.
    If necessary, the application can request the next portion of the array of records that will overwrite the current contents of the buffer.

 All   Comments   Change History   Subversion Commits      Sort Order: Ascending order - Click to sort in descending order
Rashid Abzalov added a comment - 27/Feb/19 05:40 PM - edited
If what is described in this article (https://dyemanov.github.io/records-batching/) is already implemented, then it would be enough to be able to change the value of the TcpRemoteBufferSize parameter for each request using the API, rather than the global setting on the client. It would be nice if the parameter would take the value in the records, not bytes.

For example, using a parameter in isc_dsql_fetch or its equivalent.

As a result, this would solve all needs as well as OCI_ATTR_PREFETCH_ROWS, but better (adaptive modes described in the article).
Quote:
  - "But obviously, it wastes a lot of time in the case of slow networks. So Firebird uses the asynchronous batching, also known as pipelining. As soon as all records of the batch are sent to the client, the server starts to fetch new records from the engine and cache them for the next transmission."
  - "As soon as the client library has processed some part of the current batch, it asks the server for the next batch and continues processing the remaining records. This allows to distribute the load more evenly and provide a better overall throughput. The current (hardcoded) pipelining threshold is 1/2 of the batch size."

Mark Rotteveel added a comment - 28/Feb/19 07:35 PM
The network protocol has an option to specify the number of rows to fetch, but it is not surfaced in the native client (the native client will make a guess at a number for you).

Personally, I'm not happy with the decision made in Firebird 3 that the server will limit the number of rows based on the message size, and that it will override a client if it is asking too many records (in the opinion of the server). In previous versions that just worked.

Dmitry Yemanov added a comment - 01/Mar/19 10:06 AM
I remember that discussion and IIRC I agreed with you. I just need to find time to revisit that code...