Welcome to the Question2Answer Q&A. There's also a demo if you just want to try it out.
+4 votes
939 views
in Q2A Core by
Hi. I recently manually import to qa_posts table more than 8 000 000 question. I need recount related question and everything else but there is a problem. Recounting is very slow. My database is hosted on Google Cloud (8 CPUs, 32 GB RAM) Recounting of  30 000 questions took 2 hours. So whole recounting would be take 22 days.  How can I speed up this process? Thanks a lot.
Q2A version: 1.8

3 Answers

+1 vote
by
Recounting shouldn't be that slow, but reindexing can be very slow.

However, reindexing can be sped up by first emptying (TRUNCATE in SQL) the qa_words, qa_contentwords and qa_titlewords tables. You'll need to put your site in maintenance mode first, otherwise users adding new posts could slow it down again.
by
Sorry my mistake, I mean reindexing.  I truncated all tables you wrote and it is still too slow. My CPU isn't fully utilized. It is at 10%... RAM isn't full and READ/WRITE operations are at 10% too. I think there is some another problem.

Anyway thanks for your answer,  I appreciate it.
by
The number of reindexed questions is incremeanting by 20 only. How can I increase this number or run more instances of this query? Maybe this is the problem...
by
I may have missed a table you need to truncate.
Try emptying all the tables in point 4 here: http://docs.question2answer.org/install/security/
by
No better results...
+1 vote
by

In fact, "Admin" > "Stats" > "Reindex content" is slow. The reason will be that only 10 posts are processed per one Ajax request. Your server seems to have sufficient performance. Therefore, if your server (PHP and MySQL) has sufficient resources (settings), you may be able to speed up by increasing this processing count.

Hacking example of reindex count per one ajax request:

Source: qa-include/app/recalc.php around L133 in case of V1.8

Code:

case 'doreindexcontent_postreindex':
    //$posts=qa_db_posts_get_for_reindexing($next, 10);
    $posts=qa_db_posts_get_for_reindexing($next, 100);

Note:

This process consumes a lot of server resources. Especially in case of shared server, you need to be careful. And, this change does not always work correctly on your server. It is important to gradually increase the value. For example, 50 > 100 > 300 ...

by
Hi, I don't see any improvements even when I use bigger numbers (>10 000). Process doesn't  consume a lot of server resources. CPU and READ/WRITE operations are still at 10%, RAM isn't full.  :(
+1 vote
by
edited by

If extra plugins do not use unindex_post and index_post events, following hack may be effective. You need to make sure that unindex_post and index_post are not processed for all of your plugins.

Hacking example of reindex process per one ajax request:

Source: qa-include/app/recalc.php around L133 in case of V1.8

Code:

// Change from 10 to 1000. You need to adjust according to your server environment.
$posts=qa_db_posts_get_for_reindexing($next, 1000);

if (count($posts)) {
    require_once QA_INCLUDE_DIR.'qa-app-format.php';

    $lastpostid=max(array_keys($posts));

    qa_db_prepare_for_reindexing($next, $lastpostid);
    qa_suspend_update_counts();

    // Remove all indexes by DELETE query. TRUNCATE query will be faster.
    if($next == 0)
        qa_db_truncate_indexes($next);

    foreach ($posts as $postid => $post) {
        //qa_post_unindex($postid); // Reduce DELETE queries
        qa_post_index($postid, $post['type'], $post['questionid'], $post['parentid'], $post['title'], $post['content'],
            $post['format'], qa_viewer_text($post['content'], $post['format']), $post['tags'], $post['categoryid']);
    }

...