Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Gradual slowdown compilation (create, recreate or drop) of views [CORE5612] #5878

Closed
firebird-automations opened this issue Sep 14, 2017 · 11 comments

Comments

@firebird-automations
Copy link
Collaborator

Submitted by: @abzalov

Jira_subtask_outward CORE5746

Attachments:
BUILD.sql

With a large number of existing views, the compilation of new views (or drop existing) is very slow and the time increases as the number of views increases.

Time for create 1 view:
- on empty database: 0.2s
- 50 view already exist: 1s
- 100 view already exist: 1.7s
- 200 view already exist: 3.3s

The CPU load (1 core) is 100%, and the time is almost independent of the computer configuration, i.e. that on the server with 2 Xeon processors, 64Gb of memory and hardware RAID, that on the usual desktop.

When creating real databases ~ 1000-1500 view, creating 1 view gradually slows down ~ from 30 sec. up to 1 min.

On real databases ~1000 view, creating 1 view takes ~ from 30 seconds up to 1 min.
Approximate number of records in real databases:
RDB$DEPENDENCIES ~ 1 000 000
RDB$RELATION_FIELDS ~ 300 000
RDB$RELATIONS ~ 3 500

Commits: 2cb9e64

@firebird-automations
Copy link
Collaborator Author

Commented by: @abzalov

Script to generate test views

@firebird-automations
Copy link
Collaborator Author

Modified by: @abzalov

Attachment: BUILD.sql [ 13179 ]

@firebird-automations
Copy link
Collaborator Author

Commented by: @abzalov

Do I need to provide any other information?
We tried to recalculate the system indexes after each creation of 50 views and reconnect, but this did not help.
We are very interested in solving this problem.

@firebird-automations
Copy link
Collaborator Author

Commented by: @abzalov

During debugging it turned out that dependencies are removed (MET_delete_dependencies) in MET_get_dependencies and then created (store_dependencies).
But store_dependencies still checks for the presence of dependencies - the branch "if (field_name.hasData ())".

If the check is removed, then the view re-creation time (from the example) is reduced up to 3 times (was 24 sec. became 8 sec. - checked in the debug build).

@firebird-automations
Copy link
Collaborator Author

Commented by: @abzalov

In the above method, which does not verify the existence of dependencies, there is a bug, which creates duplicate dependency records.
But it shows for what is spend a significant amount of time.

There was also one more optimization method, which completely removes the slowdown depending from the number of existed views - creating a unique index in RDB$DEPENDENCIES across all its fields (in 2.5 the field RDB$PACKAGE_NAME does not exist).
CREATE UNIQUE INDEX RDB$DEPENDENCIES_UNQ ON RDB$DEPENDENCIES
(RDB$DEPENDENT_NAME, RDB$DEPENDED_ON_NAME, RDB$FIELD_NAME, RDB$DEPENDENT_TYPE, RDB$DEPENDED_ON_TYPE);
In store_dependencies, the existence of the dependency is checked against these fields (the RDB$PACKAGE_NAME field does not participate in the selection).

After the creation of such an index in 2.5 - creating view does not depend on the number of existing views and always takes ~ 0.2s - 0.3s (tested with 500 existing views created in BUILD.sql).

In my opinion, such an index should be in the standard installation, because it will significantly speed up the create/alter/delete of views and additionally solve the task of controlling the availability of records.

Of course, we would create this index ourselves, but since 3.0 it is forbidden to create an index for system tables:
This operation is not defined for system tables.
unsuccessful metadata update.
CREATE INDEX RDB$DEPENDENCIES_UNQ failed.
no permission for ALTER access to TABLE RDB$DEPENDENCIES.

Why is this prohibited? And what about the solution to the original problem in 3.0?

@firebird-automations
Copy link
Collaborator Author

Commented by: @abzalov

In total, to solve our problem now we do the following:
  1) Create a unique index for all the fields RDB$DEPENDENCIES - much faster checks the existence of dependencies (check_dependencies)
  2) Create the index RDB$DEPENDENCIES (RDB$DEPENDENT_NAME, RDB$DEPENDENT_TYPE) - much faster deleting of dependencies (MET_delete_dependencies)
  3) Before mass creation/re-creation of views, we are disable the system indices (then enable them back) RDB$INDEX_27 and RDB$INDEX_28 - since instead of using them it is better to use 2 indexes created above

Proceeding from the fact that it is necessary to apply adaptive mechanics, and from the fact that in the source code is not used anywhere specifying the plan (perhaps there are reasons for it) - there is hardly a chance that you implement the use of "correct" system indexes in the right places.
Therefore, please remove the restriction on creating indexes for system tables, and disabling/enabling system indexes.
We ourselves will manipulate them at the right time.

In the end, what can be a bad from modification of system indexes, given that DB developers understand what they are doing?
For example, in Oracle there are no restrictions in modifying the system tables - even insert into dual table, not to mention the creation of indexes.

Of course, you can find a way around this restriction (check_gbak_cheating_insupd, check_gbak_cheating_delete), but why can not you provide a regular way?

---

Итого, чтобы сейчас разрешить нашу проблему мы делаем следующее:
1) Создаем уникальный индекс по всем полям RDB$DEPENDENCIES - намного ускоряет проверку существования зависимостей (check_dependencies)
2) Создаем индекс RDB$DEPENDENCIES(RDB$DEPENDENT_NAME, RDB$DEPENDENT_TYPE) - намного ускоряет удаление зависимостей (MET_delete_dependencies)
3) Перед массовым созданием/пересозданием представлений выключаем системные индексы (потом обратно их включаем) RDB$INDEX_27 и RDB$INDEX_28 - т.к. вместо них лучше использовать 2 индекса созданных выше

Исходя из того, что нужно применять адаптивную механику, и из того, что в исходниках нигде не используется указание плана (возможно на это есть свои причины) - вряд ли есть шансы, что вы реализуете использование "правильных" системных индексов в нужных местах.
Поэтому, пожалуйста снимите ограничение по созданию индексов для системных таблиц, и выключению/включению системных индексов.
Мы сами будем манипулировать ими в нужные моменты времени.

В конце концов, что может быть плохого модификации системных индексов, учитывая, что разработчики БД понимают, что они делают?
Например, в Oracle нет никаких ограничений в модификации системных таблиц - хоть вставляй еще 1 строку в таблицу dual, не говоря уже о создании индексов.

Конечно можно найти способ обойти это ограничение (check_gbak_cheating_insupd, check_gbak_cheating_delete), но почему нельзя предоставить штатный способ?

@firebird-automations
Copy link
Collaborator Author

Modified by: @dyemanov

assignee: Dmitry Yemanov [ dimitr ]

@firebird-automations
Copy link
Collaborator Author

Commented by: @dyemanov

System indices are adjusted in v4 (ODS13) to cover these cases. In prior versions, additional indices may be created manually.

@firebird-automations
Copy link
Collaborator Author

Modified by: @dyemanov

status: Open [ 1 ] => Resolved [ 5 ]

resolution: Fixed [ 1 ]

Fix Version: 4.0 Beta 1 [ 10750 ]

@firebird-automations
Copy link
Collaborator Author

Modified by: @pavel-zotov

status: Resolved [ 5 ] => Resolved [ 5 ]

QA Status: No test => Done successfully

@firebird-automations
Copy link
Collaborator Author

Modified by: @pavel-zotov

status: Resolved [ 5 ] => Closed [ 6 ]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants