+2 votes
As I see, you have a table named 'qa_words' that stores every individual word from all questions and answers. For example, if I enter the question "What is the question", we get 4 records in that table: "What", "is", "the" and "question". Is this a wise idea when the number of questions and answers is increasing rapidly?

1 Answer

+1 vote
The reasoning for that is better indexing of content and faster searches, I believe.

Actually the one you want to look at is qa_contentwords, which links every word to every post. It will be much bigger than qa_words.
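To make the two-table idea concrete, here is a rough sketch of how a word table and a word-to-post link table can work together. It is written in Python for illustration and is not Q2A's actual PHP code or schema: the `WordIndex` class, `tokenize` helper and the all-words-must-match search are assumptions of mine, and Q2A's real matching and ranking are not shown.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split on non-word characters (a simplification)."""
    return [w for w in re.split(r"\W+", text.lower()) if w]


class WordIndex:
    """Toy in-memory index, hypothetical and for illustration only."""

    def __init__(self):
        # word -> numeric id: one entry per distinct word (the role of qa_words)
        self.word_ids = {}
        # word id -> set of post ids: one link per word per post
        # (the role of qa_contentwords)
        self.postings = defaultdict(set)

    def add_post(self, post_id, text):
        for word in tokenize(text):
            # Reuse the existing id, or assign the next free one
            word_id = self.word_ids.setdefault(word, len(self.word_ids) + 1)
            self.postings[word_id].add(post_id)

    def search(self, query):
        """Return the ids of posts that contain every word in the query."""
        result = None
        for word in tokenize(query):
            word_id = self.word_ids.get(word)
            if word_id is None:
                return set()  # an unknown word cannot match any post
            ids = self.postings[word_id]
            result = set(ids) if result is None else result & ids
        return result or set()


index = WordIndex()
index.add_post(1, "What is the question")
index.add_post(2, "What is the answer")
print(index.search("the question"))
```

Note that the two posts share "what", "is" and "the", so the word table grows by only one entry for the second post, while the link table grows with every post. That is why the link table is the one to watch.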

Here are the stats from my site which has been running for several months now:
qa_contentwords = 22.3 MB
qa_posts = 5.3 MB (with > 10,000 posts)
qa_titlewords = 2.2 MB
qa_uservotes = 1.8 MB
qa_words = 2.0 MB

Your opinion may differ, but I don't think it's a huge problem, since qa_contentwords is only about 4x bigger than qa_posts (22.3 MB vs. 5.3 MB).

There are only so many different words in the world! Some of the stuff will end up being junk but if you run the reindex script after deleting garbage posts, that junk will be removed. IMO this could maybe be improved if the app just ignored the most common words, since they are not useful in searches and will be taking up the largest amount of room.
Thanks for posting the explanation. As you say, it's just not a big deal on today's servers. The reason Q2A doesn't have a fixed common word list is that I want it to be completely language-neutral. It still automatically ignores words that are used more than 10,000 times when searching (set by the QA_IGNORED_WORDS_FREQ constant in qa-config.php).
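As a rough illustration of that frequency cutoff (a Python sketch, not Q2A's actual code): the function name, the `word_counts` dictionary and the counts in it are all made up for the example, and only the 10,000 threshold comes from the comment above.

```python
# Mirrors the 10,000 figure mentioned above; this variable is illustrative
# and is not read from qa-config.php.
IGNORED_WORDS_FREQ = 10000


def searchable_words(query_words, word_counts, max_freq=IGNORED_WORDS_FREQ):
    """Drop query words that are used more than max_freq times site-wide."""
    return [w for w in query_words if word_counts.get(w, 0) <= max_freq]


# Made-up usage counts, standing in for per-word totals kept by the index
word_counts = {"the": 48213, "is": 30120, "reindex": 42, "script": 310}

print(searchable_words(["the", "reindex", "script", "is"], word_counts))
# prints ['reindex', 'script']
```

The appeal of this approach is that nothing language-specific is hard-coded: whichever words turn out to be most common on a given site get skipped.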
For the language thing, could you add one string to the language file with a list of common words? e.g.
   'common_words' => 'the,a,i,to,is,it,of....'
Then different languages would list the common words in their language.
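A minimal sketch of how such a per-language string could be used (Python for illustration only, not Q2A code). The `LANG_EN` dictionary, both function names and the shortened sample list are hypothetical; the list in the suggestion above is elided and is not completed here.

```python
def parse_common_words(raw):
    """Turn a comma-separated string into a set of lowercase words."""
    return {w.strip().lower() for w in raw.split(",") if w.strip()}


def strip_common(words, lang):
    """Remove words that appear in the language's common-word list."""
    common = parse_common_words(lang.get("common_words", ""))
    return [w for w in words if w.lower() not in common]


# Hypothetical language entry with a shortened sample list
LANG_EN = {"common_words": "the,a,i,to,is,it,of"}

print(strip_common("What is the question".split(), LANG_EN))
# prints ['What', 'question']
```

A language file without the key simply filters nothing, so translators who skip it would not break anything.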
Perhaps, but this would require translators to think a little more than is usually necessary...!
asked Jun 12, 2012 in Q2A Core: Why using qa_words table?