| author | Kim van der Riet <kpvdr@apache.org> | 2014-03-18 13:54:46 +0000 |
|---|---|---|
| committer | Kim van der Riet <kpvdr@apache.org> | 2014-03-18 13:54:46 +0000 |
| commit | 1d8697cfcfa2d292d5b303797a0f1266cd3bb1d7 (patch) | |
| tree | c92d248684b6607d4deb370edf396c543e00022d /qpid/cpp/src | |
| parent | eba5294974fb2a73b4e765c74196ba4a63079f03 (diff) | |
| download | qpid-python-1d8697cfcfa2d292d5b303797a0f1266cd3bb1d7.tar.gz | |
QPID-5362: No store tools exist for examining the journals - Bugfix and reorganization of qls python modules.
git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk@1578899 13f79535-47bb-0310-9956-ffa450edef68
Diffstat (limited to 'qpid/cpp/src')
| -rw-r--r-- | qpid/cpp/src/qpid/linearstore/ISSUES | 30 |
| -rw-r--r-- | qpid/cpp/src/qpid/linearstore/journal/jdir.cpp | 152 |
| -rw-r--r-- | qpid/cpp/src/qpid/linearstore/journal/jdir.h | 2 |
| -rwxr-xr-x | qpid/cpp/src/tests/linearstore/linearstoredirsetup.sh | 51 |
| -rwxr-xr-x | qpid/cpp/src/tests/linearstore/tx-test-soak.sh | 31 |
5 files changed, 142 insertions, 124 deletions
diff --git a/qpid/cpp/src/qpid/linearstore/ISSUES b/qpid/cpp/src/qpid/linearstore/ISSUES
index a9908e882e..ccadefc20c 100644
--- a/qpid/cpp/src/qpid/linearstore/ISSUES
+++ b/qpid/cpp/src/qpid/linearstore/ISSUES
@@ -47,8 +47,6 @@ Current/pending:
     svn r.1558592 2014-01-15 fixes an issue with using /dev/random as a source of random numbers for Journal serial numbers.
     svn r.1558913 2014-01-16 replaces use of /dev/urandom with several calls to rand() to construct a 64-bit random number.
     * Recommend rebuilding and testing for performance again with these two fixes. Marked POST.
-# - 1036026 [LinearStore] Qpid linear store unable to create durable queue - framing-error: Queue <q-name>: create() failed: jexception 0x0000
-    UNABLE TO REPRODUCE - but Frantizek has additional info
  - 1039522 Qpid crashes while recovering from linear store around apid::linearstore::journal::JournalFile::getFqFileName() including enq_rec::decode() threw JERR_JREC_BAD_RECTAIL
     * Possible dup of 1039525
     * May be fixed by QPID-5483 - waiting for needinfo, recommend rebuilding with QPID-5483 fix and re-testing. Marked POST.
@@ -56,18 +54,6 @@ Current/pending:
     * Possible dup of 1039522
     * May be fixed by QPID-5483 - waiting for needinfo, recommend rebuilding with QPID-5483 fix and re-testing. Marked POST.
 # - 1049870 [LinearStore] auto-delete property does not survive restart
-# 5480 1053749 [linearstore] Recovery of store failure with "JERR_MAP_NOTFOUND: Key not found in map." error message
-    svn r.1564877 2014-02-05: Proposed fix
-    * Probability: 6 of 600 (1.0%) using tx-test-soak.sh
-    * If broker is started a second time after failure, it starts correctly and test completes ok.
-    * Problem: File is being recycled to EFP with still-locked enqueues in it (ie dequeued transactionally).
-    * Problem: Record alignment check writes filler records to wrong file when decoding bad record moves across a file boundary
-    * Test of fix failed on RHEL-7
-# - 1064181 [linearstore] Qpidd closes transactional client session&connection with async_dequeue() failed
-    * jexception 0x010b LinearFileController::getCurrentSerial() threw JERR_NULL
-# - 1064230 [linearstore] Qpidd linearstore recovery sometimes fail to recover messages with recoverMessages() failed
-    * jexception 0x0701 RecoveryManager::readNextRemainingRecord() threw JERR_JREC_BADRECTAIL
-    * possible dup of 1063700
 
 Fixed/closed (in commit order):
 ===============================
@@ -104,9 +90,24 @@ NO-JIRA - Added missing Apache copyright/license text
  5479 1053701 [linearstore] Using recovered store results in "JERR_JNLF_FILEOFFSOVFL: Attempted to increase submitted offset past file size. (JournalFile::submittedDblkCount)" error message
     * Probability: 2 of 600 (0.3%) using tx-test-soak.sh
     * Fixed by checkin for QPID-5480, no longer able to reproduce. VERIFIED
+ 5480 1053749 [linearstore] Recovery of store failure with "JERR_MAP_NOTFOUND: Key not found in map." error message
+    svn r.1564877 2014-02-05: Proposed fix
+    * Probability: 6 of 600 (1.0%) using tx-test-soak.sh
+    * If broker is started a second time after failure, it starts correctly and test completes ok.
+    * Problem: File is being recycled to EFP with still-locked enqueues in it (ie dequeued transactionally).
+    * Problem: Record alignment check writes filler records to wrong file when decoding bad record moves across a file boundary
  5603 1063700 [linearstore] broker restart fails under stress test
     svn r.1574513 2014-03-05: Proposed fix. POST
     * jexception 0x0701 RecoveryManager::readNextRemainingRecord() threw JERR_JREC_BADRECTAIL
+ 5607 1064181 [linearstore] Qpidd closes transactional client session&connection with async_dequeue() failed
+    svn r.1575009 2014-03-06 Proposed fix. POST
+    * jexception 0x010b LinearFileController::getCurrentSerial() threw JERR_NULL
+ - 1064230 [linearstore] Qpidd linearstore recovery sometimes fail to recover messages with recoverMessages() failed
+    * jexception 0x0701 RecoveryManager::readNextRemainingRecord() threw JERR_JREC_BADRECTAIL
+    * possible dup of 1063700
+ - 1036026 [LinearStore] Qpid linear store unable to create durable queue - framing-error: Queue <q-name>: create() failed: jexception 0x0000
+    * UNABLE TO REPRODUCE - but Frantizek has additional info
+    * Retested after checkin 1575009, problem solved. VERIFIED
 
 Ordered checkin list:
 =====================
@@ -135,6 +136,7 @@ no. svn r Q-JIRA RHBZ Date
 19. 1564893 5361 - 2014-02-05
 20. 1564935 5361 - 2014-02-05
 21. 1574513 5603 1063700 2014-03-05
+22. 1575009 5607 1064181 2014-03-06
 
 See above sections for details on these checkins.
 
diff --git a/qpid/cpp/src/qpid/linearstore/journal/jdir.cpp b/qpid/cpp/src/qpid/linearstore/journal/jdir.cpp
index 896f44ceff..36f180c21f 100644
--- a/qpid/cpp/src/qpid/linearstore/journal/jdir.cpp
+++ b/qpid/cpp/src/qpid/linearstore/journal/jdir.cpp
@@ -101,17 +101,9 @@ jdir::clear_dir(const std::string& dirname/*, const std::string&
 */
         , const bool create_flag)
 {
-    DIR* dir = ::opendir(dirname.c_str());
-    if (!dir)
-    {
-        if (errno == 2 && create_flag) // ENOENT (No such file or dir)
-        {
-            create_dir(dirname);
-            return;
-        }
-        std::ostringstream oss;
-        oss << "dir=\"" << dirname << "\"" << FORMAT_SYSERR(errno);
-        throw jexception(jerrno::JERR_JDIR_OPENDIR, oss.str(), "jdir", "clear_dir");
+    DIR* dir = open_dir(dirname, "clear_dir", true);
+    if (!dir && create_flag) {
+        create_dir(dirname);
     }
 //#ifndef RHM_JOWRITE
     struct dirent* entry;
@@ -161,13 +153,7 @@ jdir::push_down(const std::string& dirname, const std::string& target_dir/*, con
 {
     std::string bak_dir_name = create_bak_dir(dirname/*, bak_dir_base*/);
 
-    DIR* dir = ::opendir(dirname.c_str());
-    if (!dir)
-    {
-        std::ostringstream oss;
-        oss << "dir=\"" << dirname << "\"" << FORMAT_SYSERR(errno);
-        throw jexception(jerrno::JERR_JDIR_OPENDIR, oss.str(), "jdir", "push_down");
-    }
+    DIR* dir = open_dir(dirname, "push_down", false);
     // Copy contents of targetDirName into bak dir
     struct dirent* entry;
     while ((entry = ::readdir(dir)) != 0)
@@ -251,60 +237,49 @@ jdir::delete_dir(const std::string& dirname, bool children_only)
 {
     struct dirent* entry;
     struct stat s;
-    DIR* dir = ::opendir(dirname.c_str());
-    if (!dir)
-    {
-        if (errno == ENOENT) // dir does not exist.
-            return;
-
-        std::ostringstream oss;
-        oss << "dir=\"" << dirname << "\"" << FORMAT_SYSERR(errno);
-        throw jexception(jerrno::JERR_JDIR_OPENDIR, oss.str(), "jdir", "delete_dir");
-    }
-    else
+    DIR* dir = open_dir(dirname, "delete_dir", true); // true = allow dir does not exist, return 0
+    if (!dir) return;
+    while ((entry = ::readdir(dir)) != 0)
     {
-        while ((entry = ::readdir(dir)) != 0)
+        // Ignore . and ..
+        if (std::strcmp(entry->d_name, ".") != 0 && std::strcmp(entry->d_name, "..") != 0)
         {
-            // Ignore . and ..
-            if (std::strcmp(entry->d_name, ".") != 0 && std::strcmp(entry->d_name, "..") != 0)
+            std::string full_name(dirname + "/" + entry->d_name);
+            if (::lstat(full_name.c_str(), &s))
             {
-                std::string full_name(dirname + "/" + entry->d_name);
-                if (::lstat(full_name.c_str(), &s))
-                {
-                    ::closedir(dir);
-                    std::ostringstream oss;
-                    oss << "stat: file=\"" << full_name << "\"" << FORMAT_SYSERR(errno);
-                    throw jexception(jerrno::JERR_JDIR_STAT, oss.str(), "jdir", "delete_dir");
-                }
-                if (S_ISREG(s.st_mode) || S_ISLNK(s.st_mode)) // This is a file or slink
-                {
-                    if(::unlink(full_name.c_str()))
-                    {
-                        ::closedir(dir);
-                        std::ostringstream oss;
-                        oss << "unlink: file=\"" << entry->d_name << "\"" << FORMAT_SYSERR(errno);
-                        throw jexception(jerrno::JERR_JDIR_UNLINK, oss.str(), "jdir", "delete_dir");
-                    }
-                }
-                else if (S_ISDIR(s.st_mode)) // This is a dir
-                {
-                    delete_dir(full_name);
-                }
-                else // all other types, throw up!
+                ::closedir(dir);
+                std::ostringstream oss;
+                oss << "stat: file=\"" << full_name << "\"" << FORMAT_SYSERR(errno);
+                throw jexception(jerrno::JERR_JDIR_STAT, oss.str(), "jdir", "delete_dir");
+            }
+            if (S_ISREG(s.st_mode) || S_ISLNK(s.st_mode)) // This is a file or slink
+            {
+                if(::unlink(full_name.c_str()))
                 {
                     ::closedir(dir);
                     std::ostringstream oss;
-                    oss << "file=\"" << entry->d_name << "\" is not a dir, file or slink.";
-                    oss << " (mode=0x" << std::hex << s.st_mode << std::dec << ")";
-                    throw jexception(jerrno::JERR_JDIR_BADFTYPE, oss.str(), "jdir", "delete_dir");
+                    oss << "unlink: file=\"" << entry->d_name << "\"" << FORMAT_SYSERR(errno);
+                    throw jexception(jerrno::JERR_JDIR_UNLINK, oss.str(), "jdir", "delete_dir");
                 }
             }
+            else if (S_ISDIR(s.st_mode)) // This is a dir
+            {
+                delete_dir(full_name);
+            }
+            else // all other types, throw up!
+            {
+                ::closedir(dir);
+                std::ostringstream oss;
+                oss << "file=\"" << entry->d_name << "\" is not a dir, file or slink.";
+                oss << " (mode=0x" << std::hex << s.st_mode << std::dec << ")";
+                throw jexception(jerrno::JERR_JDIR_BADFTYPE, oss.str(), "jdir", "delete_dir");
+            }
         }
+    }
 
 // FIXME: Find out why this fails with false alarms/errors from time to time...
 // While commented out, there is no error capture from reading dir entries.
 //        check_err(errno, dir, dirname, "delete_dir");
-    }
     // Now dir is empty, close and delete it
     close_dir(dir, dirname, "delete_dir");
 
@@ -321,14 +296,8 @@ jdir::delete_dir(const std::string& dirname, bool children_only)
 std::string
 jdir::create_bak_dir(const std::string& dirname)
 {
-    DIR* dir = ::opendir(dirname.c_str());
+    DIR* dir = open_dir(dirname, "create_bak_dir", false);
     long dir_num = 0L;
-    if (!dir)
-    {
-        std::ostringstream oss;
-        oss << "dir=\"" << dirname << "\"" << FORMAT_SYSERR(errno);
-        throw jexception(jerrno::JERR_JDIR_OPENDIR, oss.str(), "jdir", "create_bak_dir");
-    }
     struct dirent* entry;
     while ((entry = ::readdir(dir)) != 0)
     {
@@ -407,25 +376,23 @@ void jdir::read_dir(const std::string& name, std::vector<std::string>& dir_list,
         const bool incl_dirs, const bool incl_files, const bool incl_links, const bool return_fqfn) {
     struct stat s;
     if (is_dir(name)) {
-        DIR* dir = ::opendir(name.c_str());
-        if (dir != 0) {
-            struct dirent* entry;
-            while ((entry = ::readdir(dir)) != 0) {
-                if (std::strcmp(entry->d_name, ".") != 0 && std::strcmp(entry->d_name, "..") != 0) { // Ignore . and ..
-                    std::string full_name(name + "/" + entry->d_name);
-                    if (::stat(full_name.c_str(), &s))
-                    {
-                        ::closedir(dir);
-                        std::ostringstream oss;
-                        oss << "stat: file=\"" << full_name << "\"" << FORMAT_SYSERR(errno);
-                        throw jexception(jerrno::JERR_JDIR_STAT, oss.str(), "jdir", "delete_dir");
-                    }
-                    if ((S_ISREG(s.st_mode) && incl_files) || (S_ISDIR(s.st_mode) && incl_dirs) || (S_ISLNK(s.st_mode) && incl_links)) {
-                        if (return_fqfn) {
-                            dir_list.push_back(name + "/" + entry->d_name);
-                        } else {
-                            dir_list.push_back(entry->d_name);
-                        }
+        DIR* dir = open_dir(name, "read_dir", false);
+        struct dirent* entry;
+        while ((entry = ::readdir(dir)) != 0) {
+            if (std::strcmp(entry->d_name, ".") != 0 && std::strcmp(entry->d_name, "..") != 0) { // Ignore . and ..
+                std::string full_name(name + "/" + entry->d_name);
+                if (::stat(full_name.c_str(), &s))
+                {
+                    ::closedir(dir);
+                    std::ostringstream oss;
+                    oss << "stat: file=\"" << full_name << "\"" << FORMAT_SYSERR(errno);
+                    throw jexception(jerrno::JERR_JDIR_STAT, oss.str(), "jdir", "delete_dir");
+                }
+                if ((S_ISREG(s.st_mode) && incl_files) || (S_ISDIR(s.st_mode) && incl_dirs) || (S_ISLNK(s.st_mode) && incl_links)) {
+                    if (return_fqfn) {
+                        dir_list.push_back(name + "/" + entry->d_name);
+                    } else {
+                        dir_list.push_back(entry->d_name);
                     }
                 }
             }
@@ -457,6 +424,21 @@ jdir::close_dir(DIR* dir, const std::string& dir_name, const std::string& fn_nam
     }
 }
 
+DIR*
+jdir::open_dir(const std::string& dir_name, const std::string& fn_name, const bool test_enoent)
+{
+    DIR* dir = ::opendir(dir_name.c_str());
+    if (!dir) {
+        if (test_enoent && errno == ENOENT) {
+            return 0;
+        }
+        std::ostringstream oss;
+        oss << "dir=\"" << dir_name << "\"" << FORMAT_SYSERR(errno);
+        throw jexception(jerrno::JERR_JDIR_OPENDIR, oss.str(), "jdir", fn_name);
+    }
+    return dir;
+}
+
 std::ostream&
 operator<<(std::ostream& os, const jdir& jdir)
 {
diff --git a/qpid/cpp/src/qpid/linearstore/journal/jdir.h b/qpid/cpp/src/qpid/linearstore/journal/jdir.h
index 86b16f8545..59f21ce499 100644
--- a/qpid/cpp/src/qpid/linearstore/journal/jdir.h
+++ b/qpid/cpp/src/qpid/linearstore/journal/jdir.h
@@ -353,6 +353,8 @@ namespace journal {
          * \exception jerrno::JERR_JDIR_CLOSEDIR The directory handle could not be closed.
          */
         static void close_dir(DIR* dir, const std::string& dir_name, const std::string& fn_name);
+
+        static DIR* open_dir(const std::string& dir_name, const std::string& fn_name, const bool test_enoent);
     };
 
 }}}
diff --git a/qpid/cpp/src/tests/linearstore/linearstoredirsetup.sh b/qpid/cpp/src/tests/linearstore/linearstoredirsetup.sh
index 3cad50b1c5..ef39767e9b 100755
--- a/qpid/cpp/src/tests/linearstore/linearstoredirsetup.sh
+++ b/qpid/cpp/src/tests/linearstore/linearstoredirsetup.sh
@@ -19,26 +19,37 @@
 # under the License.
 #
 
-
-STORE_DIR=/tmp
-LINEARSTOREDIR=~/RedHat/linearstore
-
-rm -rf $STORE_DIR/qls
-rm -rf $STORE_DIR/p002
-rm $STORE_DIR/p004
-
-mkdir $STORE_DIR/qls
-mkdir $STORE_DIR/p002
-touch $STORE_DIR/p004
-mkdir $STORE_DIR/qls/p001
-touch $STORE_DIR/qls/p003
-ln -s $STORE_DIR/p002 $STORE_DIR/qls/p002
-ln -s $STORE_DIR/p004 $STORE_DIR/qls/p004
-
-${LINEARSTOREDIR}/tools/src/py/linearstore/efptool.py $STORE_DIR/qls/ -a -p 1 -s 2048 -n 25
-${LINEARSTOREDIR}/tools/src/py/linearstore/efptool.py $STORE_DIR/qls/ -a -p 1 -s 512 -n 25
-${LINEARSTOREDIR}/tools/src/py/linearstore/efptool.py $STORE_DIR/qls/ -a -p 2 -s 2048 -n 25
-
+# This script sets up a test directory which contains both
+# recoverable and non-recoverable files and directories for
+# the empty file pool (EFP).
+
+# NOTE: The following is based on typical development tree paths, not installed paths
+
+BASE_DIR=${HOME}/RedHat
+STORE_DIR=${BASE_DIR}
+PYTHON_TOOLS_DIR=${BASE_DIR}/qpid/tools/src/linearstore
+export PYTHONPATH=${BASE_DIR}/qpid/python:${BASE_DIR}/qpid/extras/qmf/src/py:${BASE_DIR}/qpid/tools/src/py
+
+# Remove old dirs (if present)
+rm -rf ${STORE_DIR}/qls
+rm -rf ${STORE_DIR}/p002
+rm ${STORE_DIR}/p004
+
+# Create new dir tree and links
+mkdir ${STORE_DIR}/p002_ext
+touch ${STORE_DIR}/p004_ext
+mkdir ${STORE_DIR}/qls
+mkdir ${STORE_DIR}/qls/p001
+touch ${STORE_DIR}/qls/p003
+ln -s ${STORE_DIR}/p002_ext ${STORE_DIR}/qls/p002
+ln -s ${STORE_DIR}/p004_ext ${STORE_DIR}/qls/p004
+
+# Populate efp dirs with empty files
+${PYTHON_TOOLS_DIR}/efptool.py $STORE_DIR/qls/ -a -p 1 -s 2048 -n 25
+${PYTHON_TOOLS_DIR}/efptool.py $STORE_DIR/qls/ -a -p 1 -s 512 -n 25
+${PYTHON_TOOLS_DIR}/efptool.py $STORE_DIR/qls/ -a -p 2 -s 2048 -n 25
+
+# Show the result for information
 ${LINEARSTOREDIR}/tools/src/py/linearstore/efptool.py $STORE_DIR/qls/ -l
 tree -la $STORE_DIR/qls
 
diff --git a/qpid/cpp/src/tests/linearstore/tx-test-soak.sh b/qpid/cpp/src/tests/linearstore/tx-test-soak.sh
index fa05e0a4a8..7d5581961f 100755
--- a/qpid/cpp/src/tests/linearstore/tx-test-soak.sh
+++ b/qpid/cpp/src/tests/linearstore/tx-test-soak.sh
@@ -19,7 +19,6 @@
 # under the License.
 #
 
-
 # tx-test-soak
 #
 # Basic test methodology:
@@ -30,6 +29,8 @@
 # 5. Run qpid-txtest against broker in check mode, which checks that all expected messages are present.
 # 6. Wash, rinse, repeat... The number of runs is determined by ${NUM_RUNS}
 
+# NOTE: The following is based on typical development tree paths, not installed paths
+
 NUM_RUNS=1000
 BASE_DIR=${HOME}/RedHat
 CMAKE_BUILD_DIR=${BASE_DIR}/q.cm
@@ -43,13 +44,18 @@ BROKER_MANAGEMENT="no" # "no" or "yes"
 TRUNCATE_INTERVAL=10
 MAX_DISK_PERC_USED=90
 
-# Consts (don't adjust these...)
+# Constants (don't adjust these)
 export BASE_DIR
 RELATIVE_BASE_DIR=`python -c "import os,os.path; print os.path.relpath(os.environ['BASE_DIR'], os.environ['PWD'])"`
+export PYTHONPATH=${BASE_DIR}/qpid/python:${BASE_DIR}/qpid/extras/qmf/src/py:${BASE_DIR}/qpid/tools/src/py
 LOG_FILE_NAME=log.txt
 QPIDD_FN=qpidd
 QPIDD=${CMAKE_BUILD_DIR}/src/${QPIDD_FN}
-TXTEST=${CMAKE_BUILD_DIR}/src/tests/qpid-txtest
+TXTEST_FN=qpid-txtest
+TXTEST=${CMAKE_BUILD_DIR}/src/tests/${TXTEST_FN}
+ANALYZE_FN=qpid_qls_analyze.py
+ANALYZE=${BASE_DIR}/qpid/tools/src/py/${ANALYZE_FN}
+ANALYZE_ARGS="--efp --show-recs --stats"
 QPIDD_BASE_ARGS="--load-module ${STORE_MODULE} -m ${BROKER_MANAGEMENT} --auth no --default-flow-stop-threshold 0 --default-flow-resume-threshold 0 --default-queue-limit 0 --store-dir ${BASE_DIR} --log-enable ${BROKER_LOG_LEVEL} --log-to-stderr no --log-to-stdout no"
 TXTEST_INIT_STR="--init yes --transfer no --check no"
 TXTEST_RUN_STR="--init no --transfer yes --check no"
@@ -181,6 +187,17 @@ check_ready_to_run() {
     fi
 }
 
+# Analyze store files
+# $1: Log suffix flag: either "A" or "B". If "A", client is started in test mode, otherwise client evaluates recovery.
+analyze_store() {
+    ${ANALYZE} ${ANALYZE_ARGS} ${BASE_DIR}/qls &> ${RESULT_DIR}/qls_analysis.$1.log
+    echo >> ${RESULT_DIR}/qls_analysis.$1.log
+    echo "----------------------------------------------------------" >> ${RESULT_DIR}/qls_analysis.$1.log
+    echo "With transactional reconsiliation:" >> ${RESULT_DIR}/qls_analysis.$1.log
+    echo >> ${RESULT_DIR}/qls_analysis.$1.log
+    ${ANALYZE} ${ANALYZE_ARGS} --txn ${BASE_DIR}/qls &>> ${RESULT_DIR}/qls_analysis.$1.log
+}
+
 ulimit -c unlimited # Allow core files to be created
 
 RESULT_BASE_DIR_SUFFIX=`date "${TIMESTAMP_FORMAT}"`
@@ -219,7 +236,8 @@ for rn in `seq ${NUM_RUNS}`; do
     sleep ${RUN_TIME}
     kill_process ${SIG_KILL} ${QPIDD_PID}
     sleep 2
-    tar -czf ${RESULT_DIR}/qls_B.tar.gz ${RELATIVE_BASE_DIR}/qls
+    analyze_store "A"
+    tar -czf ${RESULT_DIR}/qls_A.tar.gz ${RELATIVE_BASE_DIR}/qls
 
     # === PART B: Recovery and check ===
     start_broker "B"
@@ -234,11 +252,14 @@ for rn in `seq ${NUM_RUNS}`; do
         kill_process ${SIG_KILL} ${PID}
         sleep 2
     fi
-    tar -czf ${RESULT_DIR}/qls_C.tar.gz ${RELATIVE_BASE_DIR}/qls
+    analyze_store "B"
+    tar -czf ${RESULT_DIR}/qls_B.tar.gz ${RELATIVE_BASE_DIR}/qls
 
     # === Check for errors, cores and exceptions in logs ===
     grep -Hn "jexception" ${RESULT_DIR}/qpidd.A.log | tee -a ${LOG_FILE}
     grep -Hn "jexception" ${RESULT_DIR}/qpidd.B.log | tee -a ${LOG_FILE}
+    grep -Hn "Traceback (most recent call last):" ${RESULT_DIR}/qls_analysis.A.log | tee -a ${LOG_FILE}
+    grep -Hn "Traceback (most recent call last):" ${RESULT_DIR}/qls_analysis.B.log | tee -a ${LOG_FILE}
     grep "${SUCCESS_MSG}" ${RESULT_DIR}/txtest.B.log &> /dev/null
     if [[ "$?" != "0" ]]; then
         echo "ERROR in run ${rn}" >> ${LOG_FILE}
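The jdir.cpp changes above replace five copies of the `::opendir()` error handling with a single `open_dir()` helper, whose `test_enoent` flag selects whether a missing directory is reported as a null handle or as an exception. The pattern can be tried outside the qpid tree with a minimal standalone sketch. It assumes `std::runtime_error` in place of `jexception` and a plain `errno`/`strerror` message in place of `FORMAT_SYSERR`, neither of which is reproduced here. The names `open_dir_sketch` and `count_entries_sketch` are hypothetical and do not exist in qpid.

```cpp
#include <cerrno>
#include <cstring>
#include <dirent.h>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for jdir::open_dir(). std::runtime_error replaces
// jexception so that the sketch compiles without the qpid headers.
DIR* open_dir_sketch(const std::string& dir_name, const std::string& fn_name, const bool test_enoent)
{
    DIR* dir = ::opendir(dir_name.c_str());
    if (!dir) {
        const int err = errno; // save errno before any later call can overwrite it
        if (test_enoent && err == ENOENT) {
            return 0; // caller asked to tolerate a missing directory
        }
        std::ostringstream oss;
        oss << fn_name << ": dir=\"" << dir_name << "\" errno=" << err
            << " (" << std::strerror(err) << ")";
        throw std::runtime_error(oss.str());
    }
    return dir;
}

// Hypothetical caller in the style of delete_dir(): counts the entries other
// than "." and "..", and returns -1 if the directory does not exist.
long count_entries_sketch(const std::string& dir_name)
{
    DIR* dir = open_dir_sketch(dir_name, "count_entries_sketch", true);
    if (!dir) return -1; // null handle must be checked before ::readdir()
    long n = 0;
    struct dirent* entry;
    while ((entry = ::readdir(dir)) != 0) {
        if (std::strcmp(entry->d_name, ".") != 0 && std::strcmp(entry->d_name, "..") != 0) {
            ++n;
        }
    }
    ::closedir(dir);
    return n;
}
```

A caller that passes `true` for `test_enoent` takes on the null check itself, as `delete_dir()` does with `if (!dir) return;` in the diff. Callers that pass `false` can use the returned handle directly, since every failure surfaces as an exception.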
