Key: CORE-2340
Type: Bug
Status: Closed
Resolution: Fixed
Priority: Major
Assignee: Vlad Khorsun
Reporter: Vlad Khorsun
Votes: 0
Watchers: 1

Firebird Core

Bugcheck 258 (page slot not empty) could occur under high concurrent load

Created: 24/Feb/09 05:11 AM   Updated: 08/Nov/09 09:55 PM
Component/s: Engine
Affects Version/s: 2.0.0, 1.5.4, 2.0.1, 2.0.2, 2.0.3, 1.5.5, 2.1.0, 2.0.4, 2.5 Alpha 1, 2.1.1, 2.0.5, 2.1.2, 1.5.6
Fix Version/s: 2.5 Beta 1, 2.1.3

Time Tracking:
Not Specified

Planning Status: Unspecified

Description
When several attachments extend the same relation simultaneously, a bugcheck could occur.
Reproduced with SuperServer (SS), but the bug does not depend on the engine architecture.
The test ran 20 threads inserting data into the same table. The bugcheck happens only after a few hours of work, so it is rare and hard to reproduce.

Vlad Khorsun added a comment - 24/Feb/09 05:29 AM
A brief explanation of the bug and the fix. See extend_relation in dpm.epp.
To link a newly allocated data page into a relation, we need to find a free slot on some pointer page.
If there are no free slots, we allocate and link a new pointer page.

for (pp_sequence = relPages->rel_slot_space;; pp_sequence++) {
if (!(ppage = get_pointer_page(tdbb, relation, relPages, &pp_window,
pp_sequence, LCK_write)))
BUGCHECK(253); /* msg 253 pointer page vanished from extend_relation */
SLONG* slots = ppage->ppg_page;
if (ppage->ppg_header.pag_flags & ppg_eof) {

-- here we allocate a new PP
ppage = (pointer_page*) DPM_allocate(tdbb, &new_pp_window);
ppage->ppg_header.pag_type = pag_pointer;
ppage->ppg_header.pag_flags |= ppg_eof;
ppage->ppg_relation = relation->rel_id;
ppage->ppg_sequence = ++pp_sequence;
slot = 0;
CCH_RELEASE(tdbb, &new_pp_window);

-- here we store the new PP number in the in-memory array of PPs
vcl* vector = relPages->rel_pages =
vcl::newVector(*dbb->dbb_permanent, relPages->rel_pages, pp_sequence + 1);
(*vector)[pp_sequence] = new_pp_window.win_page.getPageNum();
-- after this point concurrent attachments in the same SS process could see the new PP
-- but we have not released the scheduler yet, so they can't

// hvlad: temporary tables don't save their pointer pages in RDB$PAGES
if (relation->rel_id && (relPages->rel_instance_id == 0))
-- here we store the new PP number in RDB$PAGES
DPM_pages(tdbb, relation->rel_id, pag_pointer,
(SLONG) pp_sequence, new_pp_window.win_page.getPageNum());
-- after this point any concurrent attachment could see the new PP
-- and, of course, we released the scheduler inside DPM_pages
relPages->rel_slot_space = pp_sequence;

ppage = (pointer_page*) pp_window.win_buffer;
CCH_MARK(tdbb, &pp_window);
ppage->ppg_header.pag_flags &= ~ppg_eof;
ppage->ppg_next = new_pp_window.win_page.getPageNum();

-- BUG: at this point slot 0 could already be filled by some other attachment

- ppage = (pointer_page*) CCH_HANDOFF(tdbb, &pp_window, new_pp_window.win_page.getPageNum(),
- LCK_write, pag_pointer);
- break;

-- FIX: therefore release the PP and go through the loop again, searching for a free slot
+ --pp_sequence;
CCH_RELEASE(tdbb, &pp_window);

/* We've found a slot. Stick in the pointer to the data page */

if (ppage->ppg_page[slot])
-- another BUG: the pointer page was not released, so the engine is in a bad state and can't report the error correctly
+ CCH_RELEASE(tdbb, &pp_window);
CORRUPT(258); /* msg 258 page slot not empty */
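
The retry idea behind the fix can be illustrated outside the engine. The following is only a minimal single-threaded sketch, not Firebird code: PointerPage, Relation, SlotRef, linkDataPage and the afterPublish hook are hypothetical names invented for this illustration. The hook stands in for the window (inside DPM_pages) in which another attachment may run and take slot 0 of the freshly published pointer page. Instead of assuming slot 0 is still free, the loop simply scans the new page like any other.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical, simplified stand-in for a pointer page:
// a slot value of 0 means "free", anything else is a data page number.
struct PointerPage {
    std::vector<long> slots;
    explicit PointerPage(std::size_t n) : slots(n, 0) {}
};

// Hypothetical stand-in for a relation: an ordered list of pointer pages.
struct Relation {
    std::vector<PointerPage> pages;
    std::size_t slotsPerPage;
    explicit Relation(std::size_t n) : slotsPerPage(n) { pages.emplace_back(n); }
};

struct SlotRef {
    std::size_t page;
    std::size_t slot;
};

// Stores dataPage in the first free slot and returns where it was stored.
// afterPublish (an assumption of this sketch) is invoked right after a new
// pointer page becomes visible, modelling a concurrent attachment.
SlotRef linkDataPage(Relation& rel, long dataPage,
                     const std::function<void(Relation&)>& afterPublish = {}) {
    assert(dataPage != 0);  // 0 is reserved as the "free slot" marker
    for (std::size_t seq = 0;; ++seq) {
        if (seq == rel.pages.size()) {
            // No free slot on any existing page: allocate and publish a new one.
            rel.pages.emplace_back(rel.slotsPerPage);
            if (afterPublish)
                afterPublish(rel);
            // Do NOT assume slot 0 is free here; fall through and scan.
        }
        std::vector<long>& slots = rel.pages[seq].slots;
        for (std::size_t s = 0; s < slots.size(); ++s) {
            if (slots[s] == 0) {
                slots[s] = dataPage;
                return SlotRef{seq, s};
            }
        }
        // Page is full (possibly filled by someone else): try the next page.
    }
}
```

With two slots per page, filling the first page and then calling linkDataPage with a hook that itself links a page shows the effect: the hook takes slot 0 of the new pointer page, and the original caller ends up in slot 1 instead of overwriting slot 0.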

Vlad Khorsun added a comment - 06/Mar/09 08:10 AM
The fix has been backported to 2.1.3