in Q2A Core by
A) qa_contentwords is growing too fast. Which functions of Q2A would stop working if one empties this table?

I understand that search wouldn't work?

B) Could the qa_contentwords table be reduced to hold only the words of question titles for search? Which function would one have to change, and how?

(I am working on a site with 200,000 localities in 60,000 categories on 4 levels. Performance is brilliant, reindexing is fast, and it is surprisingly stable. The only problem is that the qa_contentwords table uses half of the total database space.)
Q2A version: 1.6

1 Answer

+1 vote
by
Best answer
A) Search won't work. Related questions won't work. I think tags will still work as I think there is a separate table for mapping tags to questions, but you should double check that. Can't think of anything else at the moment.

B) Yes, but it would probably require a lot of work modifying Q2A to do that.

To be honest I'm not sure there is such a big problem. Think about it, currently you are saying it's a problem that your database is N megabytes. If you removed the contentwords table, the database would be half the size it is now. But in a year's time with more content added, you will already be back up to that same N megabytes and will be having the same problem.
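
To put rough numbers on this argument, here is a minimal Python sketch. It assumes the database grows linearly, and the figures (500 MB freed, 40 MB of new data per month) are made up for illustration; they are not measurements from any real site. The function name `months_to_regrow` is hypothetical as well.

```python
import math


def months_to_regrow(freed_mb: float, growth_mb_per_month: float) -> float:
    """Months until a shrunken database is back at its old size.

    Assumes linear growth: the freed space is consumed again at a
    constant rate of growth_mb_per_month.
    """
    if growth_mb_per_month <= 0:
        # No growth means the old size is never reached again.
        return math.inf
    return freed_mb / growth_mb_per_month


if __name__ == "__main__":
    # Hypothetical figures: 500 MB freed, 40 MB of new data per month.
    print(months_to_regrow(500, 40))  # 12.5
```

If the site barely grows, the function returns infinity, and in that case emptying the table would be a lasting saving rather than a temporary one.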

Storage is cheap; most servers come with hundreds of gigabytes or over a terabyte of storage. If you have hundreds of thousands of questions then you can obviously afford a decent server.
by
Thank you, Scott,

That sounds good. I could work without related questions, and without tags as well.
But I would need the search.

Regarding the work to change the recalculation: qa-db-recalc.php contains some functions which fill the qa_contentwords table. I think it would be entirely sufficient to fill this table only with words from question titles, I just have no idea yet how to get there. That way I would at least be able to search in titles.

About the space: this website won't have much income. It is more of an informational site about 200,000 schools/locations, where people can comment on some things. There wouldn't be additional schools (questions), just answers or comments.

Currently qa_contentwords is at about 530 MB, and the shared hosting limits its databases to 1 GB.

However, having an option to choose whether content or just titles are searched could be a nice feature for the future...
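
To illustrate the idea of a titles-only search option, here is a minimal in-memory sketch in Python. It is not Q2A code and does not reflect how Q2A actually builds its word tables; the names `build_index`, `search`, `index_size` and the `titles_only` flag are hypothetical, and the two sample posts are invented. It only shows why indexing titles alone produces a much smaller index, at the cost of no longer finding words that appear only in the content.

```python
import re
from collections import defaultdict

WORD_RE = re.compile(r"\w+")


def tokenize(text):
    """Split text into lowercase words."""
    return WORD_RE.findall(text.lower())


def build_index(posts, titles_only=False):
    """Map each word to the set of post ids that contain it.

    With titles_only=True, only the title of each post is indexed.
    """
    index = defaultdict(set)
    for post_id, post in posts.items():
        text = post["title"]
        if not titles_only:
            text += " " + post["content"]
        for word in tokenize(text):
            index[word].add(post_id)
    return index


def search(index, query):
    """Return the ids of posts that contain every word of the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


def index_size(index):
    """Number of (word, post) pairs, a rough stand-in for table rows."""
    return sum(len(ids) for ids in index.values())


if __name__ == "__main__":
    # Invented sample data, for illustration only.
    posts = {
        1: {"title": "Springfield Primary School",
            "content": "Parents praise the music lessons"},
        2: {"title": "Riverside High School",
            "content": "Large campus near the station"},
    }
    full = build_index(posts)
    titles = build_index(posts, titles_only=True)
    print(index_size(full), index_size(titles))
    print(search(titles, "school"), search(titles, "music"))
```

In this toy example the titles-only index still finds both posts for "school" but nothing for "music", which only occurs in the content of the first post.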
by
Scott,
thanks again. I deleted all rows from qa_contentwords and the lucky result is:

Search still works for title words
Tags still work as well.

So, for those who do not need the content for search,
I think this is a good space saver.

It would still be interesting to understand the full function of this table,
but for the moment I have what I wanted...
by
@Scott: "Related questions won't work" → Related questions will work too =)

And I think it's not about storage but performance, no? The server no longer has to crawl through all of qa_contentwords, my nightly automated backup takes less time, and the server stays fast.

I think qa_contentwords is only used for the search. As I am using Google Search now (gcse), this table becomes unnecessary. See also http://question2answer.org/qa/23585/turning-search-replacing-with-gcse-help-database-cleaning
...