Fb4 RC1 synchronous replication to localhost hang on disconnect [CORE6497] #6727

Closed
firebird-automations opened this issue Feb 24, 2021 · 15 comments

Comments

@firebird-automations
Collaborator

Submitted by: Lucas Schatz (arvanus)

Steps to reproduce:

replication.conf:
database = /db/primary.fdb
{
sync_replica = SYSDBA:*******@localhost:/db/replica.fdb
}

systemctl stop firebird
mkdir /db
chown firebird. /db
rm /db/primary.fdb /db/replica.fdb -f
systemctl start firebird
echo "create database 'localhost:/db/primary.fdb';ALTER DATABASE ENABLE PUBLICATION; ALTER DATABASE INCLUDE ALL TO PUBLICATION;quit;" | /opt/firebird/bin/isql
systemctl stop firebird
cp -a /db/primary.fdb /db/replica.fdb
gfix -replica read_write /db/replica.fdb
systemctl start firebird
echo "show table;quit;" | /opt/firebird/bin/isql localhost:/db/primary.fdb
echo "show table;quit;" | /opt/firebird/bin/isql localhost:/db/replica.fdb
echo "create table tb1 (a integer not null, constraint tb1_pk primary key (a));commit;quit;" | /opt/firebird/bin/isql localhost:/db/primary.fdb # Here the script hangs on the quit; command, and I need to pkill -9 both isql and firebird
pkill -9 firebird; pkill -9 isql
echo "show table;quit;" | /opt/firebird/bin/isql localhost:/db/primary.fdb # Table is created
echo "show table;quit;" | /opt/firebird/bin/isql localhost:/db/replica.fdb # Table is replicated

No errors are reported in either replication.log or firebird.log.
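Because the hang leaves no trace in the logs, it can help to run the failing step under a time limit so the script reports the hang instead of blocking. The following is only a sketch: `run_with_limit` is a hypothetical helper, not part of Firebird, and it assumes GNU coreutils `timeout` is installed.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Firebird): run a command under a time limit
# and print OK, HANG, or FAILED(<status>) on stdout.
# Assumes GNU coreutils `timeout`: it exits with 124 when the limit expires,
# or 137 (128 + SIGKILL) when the follow-up KILL requested by -k was needed.
run_with_limit() {
    local limit="$1"
    shift
    # Send the command's own output to stderr so stdout carries only the verdict.
    timeout -k 5 "$limit" "$@" >&2
    local rc=$?
    case "$rc" in
        0)       echo "OK" ;;
        124|137) echo "HANG" ;;
        *)       echo "FAILED($rc)" ;;
    esac
}
```

For example, `echo "create table tb1 (a integer not null);commit;quit;" | run_with_limit 30 /opt/firebird/bin/isql localhost:/db/primary.fdb` would be expected to print `HANG` on an affected build. The helper only stops isql; a hung firebird process still has to be killed separately, as in the steps above.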

Commits: f18079a

@firebird-automations
Collaborator Author

Modified by: @dyemanov

assignee: Dmitry Yemanov [ dimitr ]

@firebird-automations
Collaborator Author

Commented by: Martin Wong (mw140213)

I can reproduce the same issue with the following scenario:

1- Connect to the replica database and keep the connection active.
2- Connect to the master database and execute some update statements.
3- Disconnect from the replica database.
4- Disconnect from the master database; at this point the Firebird server hangs.

I hope that will help to find the issue.
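The four steps above could be scripted roughly as follows. This is a sketch under assumptions: `hold_session` is a hypothetical helper, the isql path and database paths are taken from the original report, and the `insert into tb1` statement is only a placeholder for "some update statements".

```shell
#!/usr/bin/env bash
# Hypothetical helper: feed one statement to a command, keep its stdin open for
# a number of seconds, then send "quit;" so the session ends on schedule.
hold_session() {
    local hold_seconds="$1"
    local statement="$2"
    shift 2
    { printf '%s\n' "$statement"; sleep "$hold_seconds"; printf 'quit;\n'; } | "$@"
}

# Usage sketch, commented out because it needs a running Firebird server:
# ISQL=/opt/firebird/bin/isql
# hold_session 5 "show table;" "$ISQL" localhost:/db/replica.fdb &    # steps 1 and 3
# sleep 1
# hold_session 10 "insert into tb1 values (1); commit;" "$ISQL" localhost:/db/primary.fdb &    # steps 2 and 4
# wait    # on an affected build this may never return
```

The replica session is given the shorter hold time so that it disconnects before the master session does, which matches the order of steps 3 and 4.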

@firebird-automations
Collaborator Author

Commented by: @dyemanov

I've committed the fix, please test the next (tomorrow's) snapshot build.

@firebird-automations
Collaborator Author

Commented by: Lucas Schatz (arvanus)

Hi @dmitry, I just downloaded the snapshot Firebird-4.0.0.2372-ReleaseCandidate1.amd64.tar.gz, but it still hangs.
PS: I'm not 100% sure, but I believe that Fb CI/CD automatically uploads the snapshot, am I correct?

@firebird-automations
Collaborator Author

Commented by: @dyemanov

No, snapshots are built nightly and available here: http://web.firebirdsql.org/downloads/snapshot_builds/
The fixed one should have build number 2374.
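To check whether a downloaded snapshot is new enough, the build number can be read from the file name. This is a minimal sketch: `build_number` and `has_build` are hypothetical helpers, and they assume the `Firebird-<major>.<minor>.<patch>.<build>...` naming seen in this thread. The comparison is only meaningful between snapshots of the same branch.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the fourth version component (the build number)
# of a snapshot file name, or nothing if the name does not match.
build_number() {
    printf '%s\n' "$1" |
        sed -n 's/^Firebird-[0-9]*\.[0-9]*\.[0-9]*\.\([0-9][0-9]*\).*/\1/p'
}

# Succeeds when the snapshot's build number is at least the required one.
has_build() {
    local build
    build="$(build_number "$1")"
    [ -n "$build" ] && [ "$build" -ge "$2" ]
}
```

For example, `has_build Firebird-4.0.0.2372-ReleaseCandidate1.amd64.tar.gz 2374` fails, while the same check on `Firebird-4.0.0.2374-0_x64.zip` succeeds.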

@firebird-automations
Collaborator Author

Commented by: Martin Wong (mw140213)

I downloaded today's snapshot and tested it: the problem is fixed, and the server no longer hangs after disconnecting from the master database.
I used this build http://web.firebirdsql.org/download/snapshot_builds/win/4.0/Firebird-4.0.0.2374-0_x64.zip.

@firebird-automations
Collaborator Author

Commented by: @dyemanov

Thanks for confirmation.

@firebird-automations
Collaborator Author

Commented by: Lucas Schatz (arvanus)

I tested here too; the problem is fixed.
Thanks

@firebird-automations
Collaborator Author

Modified by: @dyemanov

status: Open [ 1 ] => Resolved [ 5 ]

resolution: Fixed [ 1 ]

Fix Version: 4.0.0 [ 10931 ]

@ittentmf

The same issue may still be present in SuperClassic mode (tested with snapshot Firebird-4.0.1.2628-0_x64 from 2021-10-12). With ServerMode = SuperClassic, replication hangs as soon as I disconnect from the master db.

@dyemanov
Member

@ittentmf Does it happen when both primary and replica databases are served by the same host (and a single FB instance), as described in this ticket? Or can you also reproduce it if replication is set up over the wire?

@ittentmf

ittentmf commented Oct 14, 2021

@dyemanov It happens in both scenarios, same host and over the wire. Maybe worth mentioning: when the replica database is on a different host, the issue occurs only if the master host uses SuperClassic mode. If the master host uses SuperServer and the replica host uses SuperClassic, the issue does not occur.
Edit: the way I can reproduce it: open two connections to the master database. Perform some transactions. Close the second connection. Perform a transaction with the first one. New transactions are no longer replicated, and there is no error in the replication log. When I close the first connection too and then open another connection, the next transaction is replicated again, but the transactions from the first connection are still missing in the replica database. So replication seems to continue after all connections to the master db have been closed, but updates made in the meantime are not applied to the replica.

@dyemanov
Member

Confirmed, thanks. Working on a solution.

dyemanov added a commit that referenced this issue Oct 27, 2021
@dyemanov
Member

@ittentmf It should be fixed now (build 2646).

@ittentmf

@dyemanov looks good. can not reproduce the issue anymore. thanks.

dyemanov added a commit that referenced this issue Nov 3, 2021