another performance degradation in v2.1.2 (at least twice slower than v1.5.5) [CORE2334] #2758

Closed
firebird-automations opened this issue Feb 20, 2009 · 25 comments

Comments

@firebird-automations

Submitted by: Paulius Pazera (ppazera)

Attachments:
amdtests.pdf

Votes: 1

The test case below shows a slowdown of more than 2x in 2.1.2rc1 compared with 1.5.5. To reproduce: create a database, reconnect isql via localhost, and execute the script below. Note how long 'select * from sp;' takes to insert the initial records, and how long 'insert .. select ...' takes to insert the additional records. For example, on our dual Xeon server, Classic Firebird v1.5.5 takes ~9 seconds for the initial insert and ~5.5 seconds for the additional records, while Classic v2.1.2rc1 takes ~22 seconds and ~13.5 seconds respectively on the same hardware.

set stats on;
set plan on;
create table tbl(i1 integer);
create index idx_tbl_i1 on tbl(i1);
commit;

set term ^;
create procedure sp
returns (i integer)
as begin
i=0;
while(i<1000000) do begin
i=i+1;
insert into tbl values (:i/5);
end
suspend;
end
^
set term ;^

select * from sp; /* ~9 sec v1.5.5 vs ~22 sec v2.1.2 -- more than twice slower */
commit;

insert into tbl (i1)
select i1+100000 from tbl where i1<100000; /* ~5 sec v1.5.5 vs ~14 sec v2.1.2 -- more than twice slower */
rollback;

insert into tbl (i1)
select i1+100000 from tbl where i1<100000; /* ~6 sec v1.5.5 vs ~13 sec v2.1.2 -- more than twice slower */
commit;

drop procedure sp;
drop table tbl;
commit;
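
For a sense of the workload size, the row counts this script produces can be worked out with a short sketch. It assumes that `:i/5` truncates like integer division (SQL dialect 3 behaviour); under other division semantics the stored values would differ. The variable names are illustrative only.

```python
# Model of the rows produced by the test script above.
# Assumption: ":i/5" truncates like integer division (SQL dialect 3).

# The procedure increments i before each insert, so i runs from 1 to 1,000,000.
values = [i // 5 for i in range(1, 1_000_001)]

total_rows = len(values)
# Rows matched by "where i1 < 100000" in the insert ... select statements.
selected_rows = sum(1 for v in values if v < 100_000)

print(total_rows, selected_rows)
```

Under that assumption the procedure inserts 1,000,000 rows, and each `insert ... select` copies 499,999 rows (the first one is rolled back, so the second starts from the same table contents).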

@firebird-automations

Commented by: @dyemanov

I don't think this degradation is specific to v2.1.2. It presumably affects the entire v2.1 series, and maybe the v2.0 series as well.

@firebird-automations

Commented by: @dyemanov

Also, could you please re-test v2.1 with DatabaseGrowthIncrement in firebird.conf set to zero (disabled)?

@firebird-automations

Commented by: Paulius Pazera (ppazera)

Setting DatabaseGrowthIncrement = 0 did not make any difference.

@firebird-automations

Commented by: @dyemanov

Once again, this seems to be a Linux-only issue: on Windows this test performs equally fast on v1.5.5 and v2.1.2.

@firebird-automations

Commented by: @asfernandes

Or Intel only issue.

@firebird-automations

Commented by: @dyemanov

> Or Intel only issue.

I've switched to the Intel platform recently, so it seems unlikely :-)

@firebird-automations

Commented by: @asfernandes

I came to two conclusions after testing.
1) The binaries distributed by the project run much slower than the ones I built with gcc 4.3.2-1ubuntu12. Paulius, can you build FB yourself and test whether it runs faster?
2) There is some room for optimization, but I can't demonstrate a gain outside of profiling. The area I changed is some constification in the BTR/BTN code, to generate better code. The changes are not very clean, so it isn't worth investing in them.

@firebird-automations

Commented by: Paulius Pazera (ppazera)

How long did it take you to run that test case on 1.5.x, the 2.1.2 release, and the 2.1.2 Ubuntu build?

Unfortunately I cannot compile R2_1_2 using gcc 4.2.1 on SUSE: 'create_db empty.fdb' gets stuck using 100% CPU (the same happens when I run it manually).

@firebird-automations

Commented by: @asfernandes

These are the timings for 2.1.2-rc1-x64 on an AMD Athlon(tm) 64 X2 Dual Core Processor 5200+:

SQL> select * from sp; /* ~9 sec v1.5.5 vs ~22 sec v2.1.2 -- more than twice slower */
Elapsed time= 10.59 sec

SQL> insert into tbl (i1)
CON> select i1+100000 from tbl where i1<100000; /* ~5 sec v1.5.5 vs ~14 sec v2.1.2 -- more than twice slower */
Elapsed time= 5.83 sec

SQL> insert into tbl (i1)
CON> select i1+100000 from tbl where i1<100000; /* ~6 sec v1.5.5 vs ~13 sec v2.1.2 -- more than twice slower */
Elapsed time= 6.42 sec

1.5.5 is a bit faster, but not always.

@firebird-automations

Commented by: Saulius Vabalas (svabalas)

Adriano, would you run the same test case on the same machine, but this time using the official FB 1.5.5 CS, and post the results here? That way we can see the real difference between those two versions on the same hardware platform.

Thanks,
Saulius

@firebird-automations

Commented by: @asfernandes

I did, and as I said, it's a bit faster (for example, the first one was ~10.0). The times above are with the official x64 package. Binaries built by me give similar results.

On the other hand, the official i686 package gives this:

SQL> select * from sp; /* ~9 sec v1.5.5 vs ~22 sec v2.1.2 -- more than twice slower */
Elapsed time= 15.40 sec

SQL> insert into tbl (i1)
CON> select i1+100000 from tbl where i1<100000; /* ~5 sec v1.5.5 vs ~14 sec v2.1.2 -- more than twice slower */
Elapsed time= 8.23 sec

SQL> insert into tbl (i1)
CON> select i1+100000 from tbl where i1<100000; /* ~6 sec v1.5.5 vs ~13 sec v2.1.2 -- more than twice slower */
Elapsed time= 10.02 sec

AFAIK, Alex uses a fresh compiler for the x64 packages. That confirms an optimization problem with old GCC versions and the FB 2.x code.

@firebird-automations

Commented by: @dyemanov

IIRC, Serg Mereutza uses a relatively fresh GCC framework for the i686 snapshot builds, so it's worth downloading and testing one (http://www.dqteam.com/fb2/).

@firebird-automations

Commented by: Paulius Pazera (ppazera)

The results I provided in the original description were obtained using a snapshot build with the CORE2329 fix (i.e. already taken from http://www.dqteam.com/fb2/).

To sum up Adriano's results (it would be nice to see the actual 1.5.5 results):

select * from sp; /* ~9 sec v1.5.5 vs ~22 sec v2.1.2 -- more than twice slower */
10.59 sec official x64
15.40 sec official i686
official 1.5.5 is a bit faster (~10.0)

insert into tbl (i1) select i1+100000 from tbl where i1<100000; /* ~5 & 6 sec v1.5.5 vs ~14 & 13 sec v2.1.2 -- more than twice slower */
5.83 & 6.42 sec official x64
8.23 & 10.02 sec official i686
official 1.5.5 is a bit faster

If you still think it would be useful to test a freshly compiled Firebird 2 on our machine, then I would need help on how to make 'create_db' work so that 'make' can finish compiling binaries.

@firebird-automations

Commented by: @AlexPeshkoff

> how to make 'create_db' work so that 'make' can finish compiling binaries

Very simple: you should upgrade gcc, because 4.2.X is broken.
For example, 4.3.1 is known to work fine.

@firebird-automations

Commented by: @AlexPeshkoff

My results for these 3 tests are:
1.5.5   2.1.2 RC2   Gcc 4.3.1   x64 (gcc 4.2.4)
12.17   18.46       15.82       11.14
 6.21   12.84       10.5         6.61
 8.79   11.21       10.15        6.81

(if your browser displays that table badly, amdtests.pdf is attached)

And it's not strange. ODS11 is a 64-bit thing, so a 64-bit build is certainly faster than a 32-bit one. FB 2.X has to be a bit slower than 1.5 because it has to deal with 64-bit record numbers. BTW, the 'official' compiler for x64 is 3.3, not much different from 3.2 for x86.

I do not know why there is such a big difference between VC and gcc builds. It's quite possible that the Microsoft optimizer for 32-bit systems is better than gcc's. The high quality of gcc's 64-bit optimizer may be due to:
http://developer.amd.com/CPU/GNU/Pages/default.aspx

Next question: why is such an old compiler used to build the packages? Historically, the Firebird build process was very complicated and required a large set of various tools to be installed (and not too old ones: sometimes we had to use snapshots because the releases were not good enough). This is still the case today for old systems (32-bit builds provide compatibility with RH8 and other systems of that era). Therefore it was decided to keep binary support for such Linux distributions. On modern systems, starting with FB 2.1, all you need to do to build with your favorite compiler is ./configure && make && make dist. Yes, there happen to be buggy gcc releases (4.2.X), but that is certainly not the Firebird project's problem.

To summarize - if you care about performance:
1. Build binaries yourself,
2. And GO 64-bit.

@firebird-automations

Commented by: @AlexPeshkoff

Results of tests on amd64/athlon x2 3800+

@firebird-automations

Modified by: @AlexPeshkoff

Attachment: amdtests.pdf [ 11382 ]

@firebird-automations

Commented by: Paulius Pazera (ppazera)

Well, it's not that easy to build because of the compiler requirements. The openSUSE 10.3 we have on that server does not support gcc43. I found RPMs in the devel repository, but they require upgrades of libstdc++, libgcc, etc., and I cannot do that because it's a production server (last time I killed another box while trying to update libraries from development repositories).

I compiled with gcc43 on another box (openSUSE 11.0, but with an AMD CPU). Then I tested those binaries on the dual Xeon box, and the results were the same as with the snapshot build I downloaded from Firebird (i.e. at least twice as slow as 1.5). Are you saying that a standard 'make' without additional options/parameters on a box with a different CPU will produce different binaries that may run significantly faster? If so, can I specify some options to build Intel-friendly binaries on an AMD box?

Alex's tests show a 1.3..2.1x slowdown between the official 1.5.5 and 2.1.2rc2. We cannot blame the old Linux distribution where the official binaries are built, because I assume both the 1.5.x and 2.1.x official binaries were built on the same box, right? And the 2.1.2rc2 binary I built on a pretty new distribution shows the same results, so we cannot blame old libraries either. Maybe the default compiler options changed between 1.5 and 2.1?

Sorry, I don't believe such a slowdown is because ODS11 became 64-bit while the code is still 32-bit; there must be another reason.
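
The 1.3..2.1 range quoted above can be re-derived from Alex's table with a minimal sketch. It assumes the first column of that table is the official 1.5.5 build and the second is the official 2.1.2 RC2 build, with one row per test; the variable names are illustrative.

```python
# Timings in seconds copied from Alex's table (one entry per test).
# Assumption: column 1 is official 1.5.5, column 2 is official 2.1.2 RC2.
official_155 = [12.17, 6.21, 8.79]
official_212rc2 = [18.46, 12.84, 11.21]

# Slowdown factor per test: time on 2.1.2 RC2 divided by time on 1.5.5.
ratios = [new / old for new, old in zip(official_212rc2, official_155)]
slowdown_range = (round(min(ratios), 1), round(max(ratios), 1))

print(slowdown_range)
```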

@firebird-automations

Commented by: Paulius Pazera (ppazera)

Seems like CORE2050 is broken again: looking at strace, this test case produces 250 semop() calls in v1.5.5 and 10906 (!) semop() calls in the 2.1.2rc1 snapshot.

@firebird-automations

Commented by: Paulius Pazera (ppazera)

I thought it might be useful to see the whole summary of straced calls (the first number is the call count for 2.1.2, the second for 1.5.5):

2 6 access
3 43 brk
1 0 clone
42 19 close
1 1 execve
1 1 exit_group
13 8 fadvise64
1 0 fchmod
1 1 fcntl64
4 4 flock
37 16 fstat64
1 0 ftruncate64
4 0 futex
1 0 getcwd
2 0 getdents64
5 5 getegid32
11 9 geteuid32
5 5 getgid32
0 1 getpid
1 0 getrlimit
482 22 gettimeofday
12 11 getuid32
154 149 ioctl
1 4 kill
58454 68237 _llseek
133 34 mmap2
4 2 mprotect
74 10 munmap
68 20 open
20690 23211 read
162 12 readlink
534 528 rt_sigaction
215 214 rt_sigprocmask
33 1 semctl
2 1 semget
10906 250 semop
2 2 setfsgid32
2 2 setfsuid32
1 0 set_robust_list
1 1 set_thread_area
1 0 set_tid_address
1 0 --- SIGCHLD
37 7 stat64
1 272 time
21 21 times
2 2 umask
2 1 uname
2 1 unlink
39276 45883 write
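
To see which calls differ most between the two versions, the counts can be turned into ratios. The sketch below uses only a hand-picked subset of rows from the summary above; the selection and the names are illustrative, not part of the original measurement.

```python
# (count in 2.1.2, count in 1.5.5) for a subset of rows from the summary above.
counts = {
    "semop": (10906, 250),
    "gettimeofday": (482, 22),
    "_llseek": (58454, 68237),
    "read": (20690, 23211),
    "write": (39276, 45883),
}

# A ratio above 1 means 2.1.2 made more calls than 1.5.5.
ratios = {name: new / old for name, (new, old) in counts.items()}
ranked = sorted(ratios.items(), key=lambda item: item[1], reverse=True)

for name, ratio in ranked:
    print(f"{name:14s} {ratio:6.2f}x")
```

In this subset semop stands out by a wide margin, while the file I/O calls (_llseek, read, write) are actually somewhat fewer in 2.1.2.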

@firebird-automations

Commented by: Paulius Pazera (ppazera)

Well, it does not look like CORE2050 (2.1.2rc2 is somewhere between 1.5.5 and 2.1.1):

initial insert using SP:
16 sec v1.5.5
50 sec v2.1.1
27 sec vCore2050IntelOptimized
33 sec v2.1.2rc2

first insert/select:
9 sec v1.5.5
28 sec v2.1.1
18 sec vCore2050IntelOptimized
19 sec v2.1.2rc2

second insert/select:
9 sec v1.5.5
30 sec v2.1.1
15 sec vCore2050IntelOptimized
24 sec v2.1.2rc2

Is there anything else I can do?

I still think such a performance degradation is too much for a database server; I cannot imagine any new feature that could be worth such a big price.
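
For comparison across builds, the timings above can be normalized to the v1.5.5 baseline. This is only a sketch over the numbers listed in this comment; the helper name `relative_to` is hypothetical.

```python
# Elapsed seconds from the lists above, in the order:
# initial insert via SP, first insert/select, second insert/select.
timings = {
    "v1.5.5": [16, 9, 9],
    "v2.1.1": [50, 28, 30],
    "vCore2050IntelOptimized": [27, 18, 15],
    "v2.1.2rc2": [33, 19, 24],
}

def relative_to(baseline, data):
    """Return each build's timings divided by the baseline build's timings."""
    base = data[baseline]
    return {
        build: [round(t / b, 2) for t, b in zip(times, base)]
        for build, times in data.items()
    }

slowdown = relative_to("v1.5.5", timings)
for build, factors in slowdown.items():
    print(build, factors)
```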

@firebird-automations

Commented by: @dyemanov

Paulius, it would also be interesting to add one more measurement to your list: the v2.1.2rc2 server against a v1.5 database (i.e. an ODS10 one).

@firebird-automations

Commented by: Paulius Pazera (ppazera)

v2.1.2rc2+ods10 is even worse:

initial insert using SP:
11 sec v1.5.5
27 sec v2.1.2rc2
50 sec v2.1.2rc2+ods10

first/second insert+select:
6/7 sec v1.5.5
16/16 sec v2.1.2rc2
19/19 sec v2.1.2rc2+ods10

I also ran a bunch of other changes/tests to try to isolate the suspicious area:

original test case: +150..180% (v1.5.5 --> v2.1.2rc2)
without index: +20..50% (better)
with 5 indices: +150..180% (same)
bigger record size (~520): +50..100% (a bit better)
bigger record size (~2k): +15..50% (much better)
local file vs localhost: no difference
disk speed: no difference
more buffers (5000): initial insert is slower in v1.5 (pretty similar)

Looks like I accidentally picked the worst-case scenario for this test case (small record size with an index).

I monitored various system parameters while running the original test case on v1.5.5 and v2.1.2rc2, and noticed two major differences:

The 'virt' column (in 'top') shows 103m for v1.5 and only 14064 for v2.1 (saves memory but uses extra CPU cycles somewhere).

The 'acquire/s' column (in fb_lock_print -ia) shows 0 for v1.5 (except ~20 when starting and finishing execution of each statement) and constantly ~100 for v2.1 (acquiring something for each record instead of per statement).

@firebird-automations

Commented by: Hans J Haase (hansjhaase)

We're using FB in medical data processing. Our database sizes at different health care locations are 20 GB and more (up to 120 GB). Two weeks ago we moved from 1.5.6 to 2.1.3 on Linux (Ubuntu). The server for FB 1.5.6 CS was an Intel Quad with 5 GB and RAID5; now it is a 4x4 Quad with 16 GB. While we used the 1.5.6 binaries pretty much out of the box, the 2.1.3 CS was tuned (shmmax, sem, etc.) because of very bad performance results. N.B. the client application for electronic health recording didn't change! We tried adjusting semaphores, cache, etc. to get better performance, but the effects are only marginal(!). Client startup takes 2-6 minutes(!) when more than 40 stations are online, each client using up to 3 connections, including medical imaging (BLOBs). On 1.5.6 it takes only 45 seconds to 2 minutes. Remember, the client app didn't change. There must be a problem in the FB 2.x engine under Linux; on Windows, FB 2.x is a little bit faster. I fear we will have to go back to 1.5.x and then leave FB in the future if no one takes care of this serious problem...

@dyemanov

This ticket is very old and covers unsupported versions, thus closed. If performance is still an issue with recent versions, some updated information would be appreciated.

@dyemanov closed this as not planned on Jan 14, 2024