Bug --- repeated dashes in question 'slug' -—- and áccénts ʻ½ʼ ---

+1 vote

If you put repeated dashes in a question title, Q2A doesn't condense them down to one.

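Also, accented characters stay the same. This isn't a huge problem, but it would be nice to have the option to convert to regular ASCII just to make things simpler. This can be done quite easily with the strtr function, using its array form so that multi-byte UTF-8 characters are replaced whole (the three-string form works byte by byte and only suits single-byte encodings),
e.g. $addr = strtr($addr, array("ä" => "a", "å" => "a", "ö" => "o"));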

So ideally for me, the slug for this question would be:
bug-repeated-dashes-in-question-slug-and-accents

Possibly shortened if words like 'in' or 'and' are removed. What do you think?

EDIT: Okay, the dashes issue works fine on here. Have you made some changes in the most recent version? The demo site kept the dashes in.

asked Apr 6, 2011 in Q2A Core by DisgruntledGoat

1 Answer

0 votes

Yes, this issue has been improved in the 1.4 developer preview (not yet running on the demo site, since I only want that to show proper releases.)

I do plan to offer more options regarding question URLs in the release of 1.4. As for removing accents, the issue is that so many non-ASCII characters could be mapped to ASCII equivalents that it seems a little daunting to take care of them all. But I'll see if I can find some tables somewhere, or at least take care of European languages.

Finally, short words are removed if the question URL goes beyond 50 characters - see full logic in function qa_q_request(...) in qa-base.php.
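
For illustration only, here is one way such shortening could work. This is not the logic in qa_q_request(...); the function name shorten_slug and the "drop the shortest word first" strategy are assumptions made for this sketch, and only the 50-character limit comes from the answer above.

```php
<?php
// Hypothetical helper, NOT Q2A's qa_q_request(): removes the shortest words
// from a dash-separated slug until it fits within $maxLength characters.
function shorten_slug($slug, $maxLength = 50)
{
    $words = explode('-', $slug);

    // Keep at least one word, even if that word alone is too long.
    while (strlen(implode('-', $words)) > $maxLength && count($words) > 1) {
        // Find the first occurrence of the shortest remaining word.
        $shortestIndex = 0;
        foreach ($words as $i => $word) {
            if (strlen($word) < strlen($words[$shortestIndex])) {
                $shortestIndex = $i;
            }
        }
        // array_splice() removes the word and reindexes the array from 0.
        array_splice($words, $shortestIndex, 1);
    }

    return implode('-', $words);
}
```

With the default limit, the 48-character slug from the question is returned unchanged; with a lower limit such as 40, words like 'in', 'bug' and 'and' are dropped first.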

answered Apr 6, 2011 by gidgreen
Yep, I understand that many circumstances will warrant keeping the characters in the URL (e.g. Chinese/Japanese letters). I did have a snippet for Latin accented characters at one point, which I will see if I can find.

Another alternative I have used is to simply ignore all non-ASCII characters, so symbols like ½ and curly quotes are removed. Basic algorithm:
1. Translate any appropriate characters (é to e and so on) and lowercase the string.
2. Loop through the characters in the source string.
3. If it's in the set [a-z0-9] then add it to the slug, otherwise add a dash (if there is not already one at the end).
4. Remove dashes from the beginning/end of the output string.
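
The four steps above could be sketched in PHP roughly as follows. The function name make_ascii_slug and the small accent table are made up for this example (a real table would need to be much larger), and the code assumes the title and the source file are both UTF-8.

```php
<?php
// Hypothetical helper, not part of Q2A: builds an ASCII-only slug from a UTF-8 title.
function make_ascii_slug($title)
{
    // Step 1: translate accented characters, then lowercase.
    // The array form of strtr() replaces whole substrings, so multi-byte
    // UTF-8 characters are handled correctly. This table is illustrative only.
    $map = array(
        'á' => 'a', 'à' => 'a', 'ä' => 'a', 'å' => 'a',
        'é' => 'e', 'è' => 'e', 'ë' => 'e',
        'í' => 'i', 'ó' => 'o', 'ö' => 'o',
        'ú' => 'u', 'ü' => 'u', 'ñ' => 'n', 'ç' => 'c',
    );
    $text = strtolower(strtr($title, $map));

    // Steps 2 and 3: walk the string byte by byte. Bytes in [a-z0-9] are kept;
    // anything else (spaces, punctuation, leftover non-ASCII bytes) becomes a
    // single dash, unless the slug already ends with one.
    $allowed = 'abcdefghijklmnopqrstuvwxyz0123456789';
    $slug = '';
    $length = strlen($text);
    for ($i = 0; $i < $length; $i++) {
        $ch = $text[$i];
        if (strpos($allowed, $ch) !== false) {
            $slug .= $ch;
        } elseif (substr($slug, -1) !== '-') {
            $slug .= '-';
        }
    }

    // Step 4: remove dashes from the beginning and end.
    return trim($slug, '-');
}

echo make_ascii_slug('Bug --- repeated dashes and áccénts ½ ---'), "\n";
// prints: bug-repeated-dashes-and-accents
```

Characters that are missing from the table (including uppercase accented letters) simply turn into a dash, so the result is always plain ASCII even when the table is incomplete.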