How can we improve HanziCraft?

More compact and printer-friendly Phonetic Set

I'm thinking about making a phonetic set myself in the future, but I don't know when, so perhaps you will get a chance to do it sooner. The goal would be for it to fit on one or two A4 sheets so it's easy to use as a reference and to memorize. It doesn't have to be complete, just useful.

I would only use the Degree One set. I would only use components that occur frequently in the real world. This could be based on character frequency. When calculating these statistics, leave out characters that consist of only the component (no point in learning a "rule" if it only applies in one case).

I would also remove components that have multiple pronunciations (above a given occurrence threshold).
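
A minimal Python sketch of these two filters. The input format, the function name `filter_components`, both thresholds, and the sample frequencies are all assumptions made for illustration; none of them come from HanziCraft. It also reads "multiple pronunciations" as "the characters built on a component are split over several sounds", which is only one possible interpretation.

```python
from collections import defaultdict

def filter_components(sets, min_total_freq, max_alt_share):
    """Keep components that are frequent and mostly map to one sound.

    sets: {component: [(character, pinyin, frequency), ...]}  (hypothetical format)
    Returns {component: dominant pinyin}.
    """
    kept = {}
    for component, members in sets.items():
        # Leave out the component itself: a one-case "rule" teaches nothing.
        others = [m for m in members if m[0] != component]
        total = sum(freq for _, _, freq in others)
        if total <= 0 or total < min_total_freq:
            continue
        by_sound = defaultdict(int)
        for _, sound, freq in others:
            by_sound[sound] += freq
        main_sound, main_freq = max(by_sound.items(), key=lambda kv: kv[1])
        # Share of usage that belongs to sounds other than the dominant one.
        alt_share = 1 - main_freq / total
        if alt_share > max_alt_share:
            continue
        kept[component] = main_sound
    return kept

# Invented frequencies, only to show the shape of the data.
sample = {
    "中": [("中", "zhong", 100), ("钟", "zhong", 30),
           ("种", "zhong", 50), ("冲", "chong", 20)],
    "老": [("老", "lao", 80), ("姥", "lao", 2)],
}
print(filter_components(sample, min_total_freq=10, max_alt_share=0.25))
```

With these made-up numbers only 中 survives: 老 is dropped because its other characters are too rare to clear the frequency threshold.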

I would then put the result in a grid: one column per tone, rows grouped by consonant. Just write the component and pinyin pronunciation, leave out the examples.
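
One way the grid could be built, as a rough sketch. It assumes the input is a list of (component, numbered pinyin) pairs, treats y and w as initials for grouping purposes, and leaves neutral-tone entries out of the four tone columns. The names `split_pinyin`, `build_grid`, and `render` are made up for this example.

```python
# Two-letter initials come first so "zh" is matched before "z".
INITIALS = ("zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "j", "q", "x", "r", "z", "c", "s", "y", "w")

def split_pinyin(syllable):
    """Split 'zhong1' into ('zh', 1). An empty initial marks a vowel-initial syllable."""
    tone = int(syllable[-1]) if syllable[-1].isdigit() else 5
    initial = next((i for i in INITIALS if syllable.startswith(i)), "")
    return initial, tone

def build_grid(entries):
    """entries: [(component, numbered_pinyin)] -> {initial: {tone: [cell text]}}"""
    grid = {}
    for component, syllable in entries:
        initial, tone = split_pinyin(syllable)
        cell = f"{component} {syllable}"
        grid.setdefault(initial, {}).setdefault(tone, []).append(cell)
    return grid

def render(grid):
    """One tab-separated row per initial, one column per tone (1 to 4)."""
    lines = []
    for initial in sorted(grid):
        cells = [", ".join(grid[initial].get(tone, [])) for tone in (1, 2, 3, 4)]
        lines.append(f"{initial or '-'}\t" + "\t".join(cells))
    return "\n".join(lines)

print(render(build_grid([("中", "zhong1"), ("长", "chang2"), ("正", "zheng4")])))
```

Tab-separated output pastes directly into a spreadsheet, which may be the easiest route to a printable page.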

It might also make sense to only consider components if they appear on the right or bottom of a character, so you have reasonable confidence that this is the phonetic component (and don't throw away too much in the filter described above). But I'm not sure if that's correct.

Finally I would use pinyin with diacritics rather than numbers.
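
The conversion from numbers to diacritics is mechanical. The sketch below follows the usual placement convention (the mark goes on a or e if present, otherwise on the o of "ou", otherwise on the last vowel) and assumes input such as "zhong1", with "v" or "u:" standing in for ü. The function name is illustrative only.

```python
TONE_MARKS = {
    "a": "āáǎà", "e": "ēéěè", "i": "īíǐì",
    "o": "ōóǒò", "u": "ūúǔù", "ü": "ǖǘǚǜ",
}

def numbered_to_diacritic(syllable):
    """Convert e.g. 'zhong1' to 'zhōng'. Tone 5, 0, or no digit means neutral tone."""
    syllable = syllable.lower().replace("u:", "ü").replace("v", "ü")
    if not syllable or not syllable[-1].isdigit():
        return syllable
    base, tone = syllable[:-1], int(syllable[-1])
    if tone not in (1, 2, 3, 4):
        return base
    if "a" in base:
        target = base.index("a")
    elif "e" in base:
        target = base.index("e")
    elif "ou" in base:
        target = base.index("o")
    else:
        # Otherwise the mark goes on the last vowel (e.g. 'gui4', 'liu2').
        positions = [i for i, ch in enumerate(base) if ch in TONE_MARKS]
        if not positions:
            return base
        target = positions[-1]
    marked = TONE_MARKS[base[target]][tone - 1]
    return base[:target] + marked + base[target + 1:]

for s in ("zhong1", "lao3", "jian4", "lv4"):
    print(s, "->", numbered_to_diacritic(s))
```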

1 vote
Sjors Provoost shared this idea

3 comments

  • Admin Niel de la Rouviere (Founder, Niel de la Rouviere) commented

    Hey Sjors,

    thanks for the great suggestions and comments here.

    That file you generated is great. There is one thing to keep in mind regarding these phonetic components, though: besides regularity, one more variable describes how useful a phonetic component is, and it is called consistency.

    This refers to how "trustworthy" a phonetic component is in providing phonetic information. That is, of all the characters that contain a given phonetic component, how many show some degree of regularity with it? For example, 老 might have 6 characters (老,姥,佬,铑,銠,栳) with degree one regularity, but what about all the other characters where 老 fails to provide that information?
    I'm working on adding this feature and calculations to HanziCraft in the future.

    Sjors, is there anything else you would like me to code or help you with at the moment? I really like the PDF you created. Do you mind if I share it on my Facebook page with other learners?

    I think in the future, I'd like to give users more control to create these kinds of lists themselves. It'll take some time, but it'd be a great feature!

    Kind Regards,
    Niel
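
As a rough illustration of the consistency measure described in the comment above, it could be computed as the fraction of characters containing a component that show any regularity with it. This is only one reading of that description, not HanziCraft's actual calculation, and the count of non-regular characters in the example is invented.

```python
def consistency(regularities):
    """Fraction of characters in which the component gives phonetic information.

    regularities: one entry per character containing the component, holding a
    regularity degree (e.g. 1) or None when the component does not predict the sound.
    """
    if not regularities:
        return 0.0
    regular = sum(1 for r in regularities if r is not None)
    return regular / len(regularities)

# Six degree-one characters as in the 老 example, plus four hypothetical
# characters where the component gives no phonetic hint.
print(consistency([1] * 6 + [None] * 4))  # 0.6
```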

  • Sjors Provoost commented

    I ran my own list through the HSK Level 1 characters. The only two that were useful in a non-trivial way were ⻅ and 中. So I think that either my heuristic wasn't very good or Level 1 words are too simple for phonetic sets to be useful.

  • Sjors Provoost commented

    I manually condensed the list down to two pages that I printed and hung on my wall. Perhaps someone else will find this useful too: https://www.dropbox.com/s/zzbnyh0h3k8cgkq/Mandarin%20Phonetic%20Sets%204%2B.pdf

    Steps I took:
    * remove examples
    * remove characters with only 3 examples
    * replace pinyin numbers with diacritics
    * combine homophones (same pinyin, different character)
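
The condensing steps listed above were done by hand. A script doing roughly the same might look like the sketch below, which assumes each set is a (component, pinyin, examples) tuple and that "only 3 examples" means keeping sets with four or more. The function name, data layout, and placeholder examples are hypothetical.

```python
from collections import defaultdict

def condense(sets, min_examples=4):
    """sets: [(component, pinyin, [example characters])] -> {pinyin: [components]}

    Drops small sets, discards the examples, and merges components that
    share the same pinyin into a single entry.
    """
    merged = defaultdict(list)
    for component, pinyin, examples in sets:
        if len(examples) < min_examples:
            continue
        merged[pinyin].append(component)
    return dict(merged)

sets = [
    ("老", "lǎo", ["老", "姥", "佬", "铑", "銠", "栳"]),
    ("A", "lǎo", ["?"] * 4),   # placeholder component sharing the same pinyin
    ("B", "xx", ["?"] * 3),    # placeholder set with only three examples
]
print(condense(sets))
```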
