Hello!
This is my first post. Please let me know if it is not appropriate to ask it here.
A little bit of background
I am creating a CSL style for Japanese and English. I use a condition (the existence of a particular field in extra) to determine the language. The value of that field (Japanese hiragana) also serves as a key for sorting the bibliography. [This is not valid CSL, but it gets the job done. If you have any recommendations (for use with Zotero), I would be grateful.]
The most important part
In the piece of code below, I want to have a prefix (a space) in front of “Firstname” (or a suffix after “Lastname”), but it does not render. Formatting such as underline renders perfectly. The journal requires a space between “family” and “given” in Japanese.
I am using Zotero 6. The file is based on Chicago author-date.
CJK names are specially handled in citeproc-js. In the following example, I’ve tried sort-separator and affixes but they don’t work. Only the font-weight="bold" attribute works, which confirms that the output comes from this element.
As far as I know, this is hard-coded in the citeproc-js engine and there is no way to change it without modifying the JavaScript code.
Yes, it seems impossible using affixes. I also tried other formatting and underline and bold worked. Only affixing does not work.
This is vital functionality, especially when we use non-Japanese author names (transliterated into katakana) that need a separator when used in their original order (1). When used in “Family - Given” order, we also need another kind of separator (2). The pattern seems consistent enough that I thought a rule or an option must exist somewhere that I am just not aware of.
マイケル・ジャクソン (Michael Jackson) [Original order]
ジャクソン、マイケル (Jackson, Michael) [Inverted order]
This is so common that even hardcoding would benefit many. For example: when the name is in katakana (Unicode U+30A0 to U+30FF), add “、” for (Family, Given) and add “・” for (Given Family). A global option could then activate this behavior.
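A minimal sketch of the hardcoded rule suggested above, assuming a name counts as katakana only when both parts fall entirely inside U+30A0 to U+30FF. The names `isKatakana` and `joinKatakanaName` are hypothetical and are not part of citeproc-js.

```javascript
// Hypothetical helpers for illustration; not citeproc-js API.
// The Katakana block U+30A0..U+30FF also contains "・" (U+30FB) and "ー" (U+30FC).
const KATAKANA_ONLY = /^[\u30A0-\u30FF]+$/;

function isKatakana(text) {
  return KATAKANA_ONLY.test(text);
}

// Returns the joined name, or null when the rule does not apply,
// so that the caller can fall back to its usual rendering.
function joinKatakanaName(family, given, inverted) {
  if (!isKatakana(family) || !isKatakana(given)) {
    return null;
  }
  return inverted ? family + "、" + given : given + "・" + family;
}

console.log(joinKatakanaName("ジャクソン", "マイケル", false)); // マイケル・ジャクソン
console.log(joinKatakanaName("ジャクソン", "マイケル", true));  // ジャクソン、マイケル
```

Returning null for anything that is not pure katakana keeps the rule opt-in per name, so romanesque and kanji names are left to the existing logic.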
Still, I will need affixing to put a space between Japanese (not transliterated into katakana) name-parts for this particular style I am working on.
Do you know where I could ask (feature request or something similar)? I may also be able to contribute JavaScript if there is a lack of contributors.
I’m sorry for the long reply but I am new and do not yet know where to ask what.
@frianasoa
I’m interested in this topic also since I have tried creating a Chicago-based style for English and Japanese.
I think it’s possible to modify a part of citeproc-js. If there is a way to replace the standard citeproc-js deployed as a part of Zotero with a customized one, I want to try creating citeproc-js specialized for researchers who write in English and Japanese.
It could be nice if there were a language-specific option in Zotero and citeproc-js. However, due to the diversity of languages and needs, I guess proposing a change as a part of the standard is a little complicated.
I’m interested in this topic also since I have tried creating a Chicago-based style for English and Japanese.
Thanks for the link. Glad to see other people working on Japanese. It seems that we have hit the same limitations with the Katakana transliteration of foreign names.
I want to try creating citeproc-js specialized for researchers who write in English and Japanese.
I wonder how many of us are trying to do this. Maybe others have given up due to those limitations. I found another thread here that touched on this, but there does not seem to be any solution there.
I think hardcoding ja and zh comes from the assumption that the order will always be “Family-Given”, without expecting foreign names in Katakana. This, however, makes it quite useless in Japanese (It is a pain to have to edit by hand before publication).
proposing a change as a part of the standard is a little complicated
My understanding of “hardcoding” is that we do it when there is not enough time (contributors) to implement the logic. I suspect that if this gets much attention, it could be included in the standard.
It could be nice if there were a language-specific option in Zotero and citeproc-js.
I want to believe that hardcoding names for ja and zh was already an attempt for a language-specific option. I think if we know the right people to contact (sorry, new here) with the right proposition, it could end up in the standard in a way that would benefit everyone.
Sorry for the long response. I will try to look at citeproc-js when I have time and see how I could improve my proposition above. It would be nice if we could work together on this.
I agree, and I had a similar experience many times. Editing by hand is a workaround but also a cause of errors.
BTW, how do you use CSL to write your paper? Write in MS Word with the Zotero plugin or write with TeX?
Yes, you are right. I tried to propose, but my heart broke since there were no reactions then.
Yes, let’s discuss and work together. If we propose a change that will benefit many people and support multiple CSLs, we can expect the change to be accepted.
Just to make sure: are the following the things that are not possible with the current citeproc-js implementation?
Put a space between the family name and the first name for Japanese names containing Kanji.
Put a “・” between the family name and the first name for translated names in Katakana.
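As a rough illustration of the first case, here is a sketch that detects kanji with a Unicode script property escape and joins the parts in Family-Given order with a configurable separator. The helper names and the choice of an ASCII space as the default are assumptions, and the sample name is only illustrative.

```javascript
// Hypothetical helpers for illustration; not citeproc-js API.
// \p{Script=Han} matches kanji (CJK ideographs); it requires the "u" flag.
const HAS_KANJI = /\p{Script=Han}/u;

function hasKanji(text) {
  return HAS_KANJI.test(text);
}

// Joins a kanji name in Family-Given order with the given separator.
// Returns null when neither part contains kanji, so the caller can
// apply a different rule (for example the katakana one).
function joinKanjiName(family, given, separator = " ") {
  if (!hasKanji(family) && !hasKanji(given)) {
    return null;
  }
  return family + separator + given;
}

console.log(joinKanjiName("鈴木", "真紀"));      // 鈴木 真紀
console.log(joinKanjiName("Pullin", "Graham")); // null
```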
Just in case you folks aren’t aware, some of the (non-CSL-spec) extensions you see in citeproc-js are indeed an attempt by its author (@Frank_Bennett) to add multi-lingual support, as he is based in Japan.
I’m not sure how their process works in terms of pull requests, but ideally that would be the right path. You could maybe post an issue there first and ask whether they support adding the feature, and how they want to do it?
As for CSL, you’d want to put in a feature request at the schemas repo, though whether, when and how it might be added is currently unclear.
I’m actually working on an experimental project that will ultimately have enhanced multi-lingual support, so I’m hoping to include this as well.
Don’t be too discouraged with the lack of response; things are sometimes quiet here.
Maybe you two can do as you suggest: try to figure out how you’d like this to be configured, in general, in CSL? You could include example fragments in the issue on GitHub, or in this thread if you prefer.
@Bruce_D_Arcus1
Thank you for your valuable input. I have been playing with citeproc-js for a while now. I will post my suggestions here first for feedback (maybe not sooner than 3 weeks though) before posting an issue and a pull request.
Changes are mainly in: attributes.js, state.js, load.js, util_names_render.js
Detect Japanese by language-name (_isJapanese), detect katakana by script (_isKatakana)
If NOT romanesque AND (katakana or Japanese), run the function _renderJapaneseName
If romanesque AND (katakana or Japanese), run the function _renderJapaneseName
The code is as follows
if (romanesque === 0) {
    if ((katakana === 1 && this.state.opt["katakana-display"] !== "legacy-order") || japanese === 1) {
        blob = this._renderJapaneseName(japanese, katakana, family, given, i, j, sort_sep);
    } else {
        // XXX handle affixes for given and family
        blob = this._join([non_dropping_particle, family, given], "");
    }
} else if ((katakana === 1 && this.state.opt["katakana-display"] !== "legacy-order") || japanese === 1) {
    /*
     * Sometimes katakana can be partially romanesque when mixed with initials.
     * Initializing katakana is difficult programmatically.
     */
    blob = this._renderJapaneseName(japanese, katakana, family, given, i, j, sort_sep);
}
About _renderJapaneseName
Here are some assumptions that I want to check
I assume katakana names are normally displayed as メイ、セイ (with name-as-sort-order) and セイ・メイ (otherwise), that is, “Family, Given” and “Given・Family”.
I assume Japanese does not need & or “and” in 斉藤&山田 or 斉藤、山田&田中. This way if we add “and” to the style, it will be used by roman author names only.
If we are to hardcode a default, I think it should be these two but I want to check if there is any problem with that.
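The second assumption could be sketched roughly as follows. This is only an illustration: `joinNameList` is a hypothetical helper, and joining Japanese names with “、” alone is my reading of the assumption, not confirmed citeproc-js behavior.

```javascript
// Hypothetical helper for illustration; not the citeproc-js implementation.
// Assumption: Japanese name lists are joined with "、" only, while
// romanesque lists get the style's "and" term before the last name.
function joinNameList(names, isJapanese, andTerm = "&") {
  if (isJapanese) {
    return names.join("、");
  }
  if (names.length < 2) {
    return names.join("");
  }
  const head = names.slice(0, -1).join(", ");
  const last = names[names.length - 1];
  return head + " " + andTerm + " " + last;
}

console.log(joinNameList(["斉藤", "山田", "田中"], true));       // 斉藤、山田、田中
console.log(joinNameList(["Morgan, C.", "Patrov, A."], false)); // Morgan, C. & Patrov, A.
```

With this split, adding “and” to the style affects only the romanesque branch.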
I have added the option “katakana-display”, with default value “normal-order” (the above assumptions). If we add the value “legacy-order”, it will behave like it used to be (for existing styles that may rely on this).
However, if I add “katakana-display” as an option in the style, Zotero does not recognize the style and it does not work.
I would appreciate it if you could try this code with your styles (or other styles in Japanese) and give feedback. My styles are under “csl-files/”. Other input is welcome as well.
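A small sketch of how the option could be resolved, assuming any missing or unknown value falls back to “normal-order”. The function names are hypothetical; the condition mirrors the one in the snippet earlier in this post.

```javascript
// Hypothetical helpers for illustration; not citeproc-js API.
const DEFAULT_KATAKANA_DISPLAY = "normal-order";

// Assumption: anything other than "legacy-order" means the new default.
function resolveKatakanaDisplay(opt) {
  return opt["katakana-display"] === "legacy-order"
    ? "legacy-order"
    : DEFAULT_KATAKANA_DISPLAY;
}

// Same decision as the posted condition:
// (katakana && option is not legacy-order) || japanese
function useJapaneseRenderer(opt, flags) {
  const katakanaActive =
    flags.katakana === 1 && resolveKatakanaDisplay(opt) !== "legacy-order";
  return katakanaActive || flags.japanese === 1;
}

console.log(useJapaneseRenderer({}, { katakana: 1, japanese: 0 })); // true
console.log(
  useJapaneseRenderer({ "katakana-display": "legacy-order" }, { katakana: 1, japanese: 0 })
); // false
```

Existing styles that never set the option get the new behavior, and only styles that explicitly ask for “legacy-order” keep the old katakana rendering.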
I will file an issue and do a pull request once we find something stable.
If you try my CSL styles, you will need to add name-kana:なまえ in note/extra for Japanese entries to display properly. Set the language to ja when an entry has Japanese information with romanesque author names (common for translated works) [see fixtures/styles/].
You could consider opening up a draft pull request for citeproc-js on github? That’s typically what I do when I want to get feedback, but am still working on a feature branch.
This isn’t an area where I have much expertise, but my obvious question would be about generalizing it beyond Japanese; how would that work?
The threading seems to have lost the context, but I’ve never encountered a need for affixes (like Jr or III) in Japanese. It’s perfectly possible that I missed something there, but in any case the CSL-M spec and the processor do not anticipate them.
You could consider opening up a draft pull request for citeproc-js on github?
Thanks for your suggestion. I will do that later.
my obvious question would be about generalizing it beyond Japanese; how would that work?
I have not thought about generalizing. Since parts of zh and ja were already hardcoded, I just thought of a way to give more freedom in the CSL, or a better default if we do need hardcoding. The function _renderJapaneseName could be rewritten as _renderCJKName if there are common issues in Chinese and/or Korean regarding the default, but this will need input from people who know the other two languages.
The threading works seem to have lost the context, but I’ve never encountered a need for affixes (like Jr or III) in Japanese
I am sorry the title (my initial intent) is confusing. I do not know if I need to open a new thread but to summarize, I am interested in the following topic.
→ “Rendering contributors in Kanji, Katakana, and English (with different styles) in one Japanese document”
I do not know if this already has a solution but I am detailing my problem below.
It’s perfectly possible that I missed something there, but in any case the CSL-M spec and the processor do not anticipate them.
At first, I was asking for the ability to render name-part affixes because I wanted to have the punctuation 「・」(中黒) between the first name and the family name for katakana, without manually tweaking the entry. But I realized that this was not my only problem when rendering Japanese, so I proposed the changes above. I will submit a pull request once I am sure my proposal makes sense for Japanese users.
As far as I know, when we write a Japanese article, we mainly encounter the four kinds of contributors below. Tweaking the csl file, I could not achieve (2) and (3) at the same time as the others.
McClure, Arthur F., James R. Chrisman & Perry Mock(1985)Education for Work: The Historical Evolution of Vocational and Distributive Education in America. New Jersey: Associated University Press.
I have tried to play with csl files and got purely Japanese entries (1) and purely English entries (4) with the help of conditions. However, I could never get the other forms.
Solving this is what I am proposing in the code below.
Thanks for sending this message and letting me know about newer developments.
It is long ago, and I am not particularly “in” the topic. If you have concrete questions, don’t hesitate to ask. I have developed my own solution for the time being, it is not great but works; and I wish to support anybody working on that problem.
@frianasoa
Thank you very much for proposing the fix. I’m sorry for the delay in reply. I want to leave a comment regarding the following topic before sharing the result of the testing.
BTW, the book doesn’t exist. That was a dummy entry created by me.
Recently, some publishers tend to keep non-Japanese names as they are. For instance, I supervised the translation of the following book, and we didn’t transliterate the original author’s name (i.e., Graham Pullin) into katakana (e.g., グラハム・プリン or グレアム・プリン).
I don’t think this is the standard at the current moment. The publisher’s primary focus is technical books, and keeping original names as is is not a problem for readers.
I am also aware of problems with inconsistency in katakana notation. For example, there are variations for the same person, Martin Heidegger, a well-known German philosopher.
マルティン・ハイデッガー
マルティン・ハイデガー
Inconsistencies in katakana notation can be confusing, and depending on the field, the original notation may be used instead of katakana in the future. Of course, I do not know what will happen. Anyway, I hope that this reply helps to explain the context.
I’ll run the test and report here as soon as possible.
@frianasoa
Hi. I tested the updated citeproc-js in the following steps.
What I got was as follows.
Morgan, C. & Patrov, A. (2022=2023). 自由意志の哲学 Translated by 鈴木真紀. 東京: 城南大学出版会.
What I expected was as follows.
Since I tested the updated citeproc-js via my tool (i.e., citeproc-js-based-replacer), this issue might be caused by my tool. I’m sorry to trouble you, but please let me know how to test the updated citeproc-js directly. Thank you very much in advance.
I also know problems regarding inconsistency in Katakana notation. For example, there are variations for the same person, Martin Heidegger, a well-known German philosopher.
Yes, now I see the point of not transliterating author names into katakana.
I tested the updated citeproc-js in the following steps. …
Thanks for testing.
Actually, if you test using one of my Japanese CSL styles, you will need to add “name-kana” to your bibliography.json. It signals that the entry should be rendered in Japanese. It is also used to sort the bibliography.
I named it “name-kana” because I expected Japanese but the values do not have to be kana. I am still looking for a better field name.
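To make the role of this field concrete, here is a hedged sketch of reading it and sorting by it. It assumes the value can appear either as a top-level “name-kana” property or as a `name-kana: …` line in the note; the function names are hypothetical and the plain string comparison is a simplification, not how the styles or citeproc-js actually sort.

```javascript
// Hypothetical helpers for illustration; not citeproc-js or Zotero API.
// Assumption: the key is either item["name-kana"] or a "name-kana: ..."
// line inside item.note.
function readNameKana(item) {
  if (typeof item["name-kana"] === "string") {
    return item["name-kana"];
  }
  const match = /^name-kana:[ \t]*(.+)$/m.exec(item.note || "");
  return match ? match[1].trim() : null;
}

// An entry with the key is treated as a Japanese entry.
function isJapaneseEntry(item) {
  return readNameKana(item) !== null;
}

// Sorts a copy of the list by the key, using simple code-unit order.
// Entries without a key get "" and therefore come first (arbitrary choice).
function sortByNameKana(items) {
  const key = (item) => readNameKana(item) || "";
  return [...items].sort((a, b) => {
    const ka = key(a);
    const kb = key(b);
    return ka < kb ? -1 : ka > kb ? 1 : 0;
  });
}

const sample = [
  { id: "b", note: "name-kana: やまだ" },
  { id: "a", "name-kana": "さいとう" },
  { id: "c", note: "no key here" },
];
console.log(sortByNameKana(sample).map((item) => item.id).join(",")); // c,a,b
```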
Could you try one (or some) of your styles with some of your bibliography entries? It would be helpful if we could think of every possible name display in a Japanese document.
Katakana (with punctuations or not), Original name in English (or other language), Japanese, (single, double, multiple authors)
I will also find some time to test this and report here if I find any issues. I hope to file an issue on GitHub and submit a pull request soon.
@Frank_Bennett I mainly copied you because I’m unsure of the status of citeproc-js these days; whether it makes sense, for example, for @frianasoa to submit a PR there.