Dividing a japanese sentence into words [Archive] - Japan Forum



Tania-chan
Nov 14, 2006, 21:59
Hello, everyone:

As you all may know, the Japanese language is written without any spaces between words (unlike occidental languages such as English and Spanish). I have this sentence:

千葉県鴨川市の山中に、訓練飛行中だった陸上自衛隊木更津駐屯地所属のヘリコプターが墜落し横転、隊員2人負傷。

... and I've been asked to divide it into words, but it's really difficult, as I don't understand Japanese :(

I used AltaVista Babel Fish to translate that sentence, and the result was:

"In Yamanaka of Chiba prefecture Kamogawa city, the helicopter of the Ground Self-Defense Force Kisarazu camp post which is in the midst of training flying to fall, turning sideways and soldier 2 injury."

(I know it's not a perfect English translation, but it helps a lot xD)


So far, I've come up with this segmentation:

千葉県 = Chiba prefecture
鴨川市 = Kamogawa city
の = "no" Japanese particle
山中 = Yamanaka
に、 = "ni" Japanese particle
訓練 = training
飛行 = flight
中 = in the middle
だった = was
陸上 = ground/land
自衛隊 = Self-Defense Force
木更津 = Kisarazu
駐屯地所属 = camp post
の = "no" Japanese particle
ヘリコプター = helicopter
が = "ga" Japanese particle
墜落し = fall
横転、 = turn sideways
隊員 = soldier
2人負傷。 = 2 "human" injuries

What do you think of it? Did I do it right?

In particular, I'm very unsure about "駐屯地所属", which means something similar to "camp post". Should I divide it like this?
駐屯地 = camp
所属 = post

... or is there any better solution?

Thanks a lot for your help! :)

Tania

Cue
Nov 14, 2006, 22:24
Just a few corrections/clues:
-山中 = (read as "sanchuu" here), it means "in the mountains".
-2人負傷 = 2 people were injured/hurt.
-所属 = belonging to?

I'd divide and translate it something like this:
千葉県 鴨川市の 山中に、訓練飛行中だった 陸上自衛隊 木更津 駐屯地 所属の ヘリコプターが墜落し 横転、隊員 2人 負傷。
A helicopter belonging to the Ground Self-Defense Force's Kisarazu camp (station?) went down and overturned in the mountains in Kamogawa City, Chiba, during a familiarization flight (training flight), and 2 GSDF members were injured.

But then again, I'm not that good with English, so...I could be wrong on the vocab.

Hope this helps clear up your doubt. ^_^

Cue

Cue
Nov 14, 2006, 22:53
Ah, I found an English news page on this topic.

I know you only need to divide it and probably don't need to translate it, but I'll put a link here for your information.
http://www.japantoday.com/jp/news/389561

So, to answer your original question: you did an excellent job of dividing it into words, except for just two parts.
駐屯地所属 = 駐屯地 and 所属
2人負傷 = 2人 and 負傷
Other than that, it looks perfect!
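
For anyone who wants to try this kind of division by program, here is a minimal sketch of greedy longest-match segmentation in Python. It assumes the word list is simply the words agreed on in this thread (the LEXICON set and the segment function are made up for illustration, not taken from any real tool), so it only reproduces the division above and does not "know" Japanese.

```python
# Hypothetical lexicon: the words agreed on in this thread, plus punctuation.
LEXICON = {
    "千葉県", "鴨川市", "の", "山中", "に", "訓練", "飛行", "中", "だった",
    "陸上", "自衛隊", "木更津", "駐屯地", "所属", "ヘリコプター", "が",
    "墜落し", "横転", "隊員", "2人", "負傷", "、", "。",
}

SENTENCE = (
    "千葉県鴨川市の山中に、訓練飛行中だった陸上自衛隊木更津駐屯地所属の"
    "ヘリコプターが墜落し横転、隊員2人負傷。"
)


def segment(text, lexicon):
    """Greedy longest-match segmentation against a fixed word list."""
    max_len = max(len(word) for word in lexicon)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink; an unknown
        # character falls through as a one-character token.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in lexicon:
                tokens.append(candidate)
                i += size
                break
    return tokens


tokens = segment(SENTENCE, LEXICON)
print(" ".join(tokens))
```

With a word list this small the greedy choice happens to work; with a real dictionary, the longest match is not always the right one, which is why real segmenters also weigh context.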

Cue

Tania-chan
Nov 14, 2006, 22:57
Ah, I found an English news page on this topic.
I know you only need to divide it and probably don't need to translate it, but just in case I'll put a link here.

Yeah, I only need to divide it, but thanks for the link ... it's interesting to read that article :)

So, to answer your original question: you did an excellent job of dividing it into words, except for just two parts.
駐屯地所属 = 駐屯地 and 所属
2人負傷 = 2人 and 負傷
Other than that, it looks perfect!

Wow, thank you! I didn't expect such a good result. Thanks a lot :) :)

hkBattousai
Nov 14, 2006, 23:29
I really don't understand why there are no spaces between words in Japanese.
Isn't it more convenient to use spaces to separate words?

For example :
I live in a big dormitory away from the city.
Iliveinabigdormitoryawayfromthecity.
Both are legible of course, but the first one is easier to read, isn't it?

Cue
Nov 14, 2006, 23:42
Haha, indeed!
But we should ask ancient Chinese people first, then? XD

Tania-chan
Nov 15, 2006, 00:17
I really don't understand why there are no spaces between words in Japanese.
Isn't it more convenient to use spaces to separate words?


Yeah, for us non-Japanese speakers it would be more convenient, 'cause we are accustomed to writing spaces between words. But maybe Japanese people don't feel the same, as they are accustomed to not writing them.

But I think it's an interesting issue. Is there any particular reason why the Chinese decided to write without separating words? Does anybody know about it?

Or was it just a random decision? xD

breez
Nov 15, 2006, 00:25
Word boundaries are more obvious with pictograms than with alphabets. Of course kanji compounds could be a problem in some cases, but context helps?

yukio_michael
Nov 15, 2006, 01:11
I don't think that languages like Japanese technically have word boundaries, at least from a linguistic point of view. This is the same for Korean and Japanese. Instead, to parse a Japanese sentence, they use something called n-grams: overlapping sequences of characters whose patterns are used to predict likely groupings.
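
To make the n-gram idea concrete, here is a small Python sketch that only lists the overlapping character runs. The char_ngrams helper is a made-up name for illustration, and the scoring or prediction step that a real parser would build on top of these runs is not shown.

```python
def char_ngrams(text, n):
    """Return every overlapping run of n consecutive characters."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


# Character bigrams (n = 2) for a stretch of the sentence from this thread.
bigrams = char_ngrams("自衛隊木更津", 2)
print(bigrams)  # ['自衛', '衛隊', '隊木', '木更', '更津']
```

Note that a run like 隊木 straddles two words (自衛隊 and 木更津); presumably a statistical model would learn that such runs are rare inside a single word and place a boundary there.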

To parse a Japanese sentence by eye, you should look for groups of patterns, set off by verbs and particles as guideposts, usually starting at the end of a sentence...

Specific particles like から (-kara) "because" and けれど (-keredo) "although" connect major ideas in the sentence. Remember that these are postpositions, unlike English prepositions, so they follow the concept or word that they modify; in fact, they almost attach themselves to it.
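
As a rough illustration of using those two particles as guideposts, here is a naive Python sketch that splits a sentence wherever から or けれど appears. The example sentence is made up for this sketch (roughly "Because it rained I stayed home, but it was boring"), and the approach is deliberately crude: から can also mean "from", and either string could turn up inside an unrelated word.

```python
import re

# Only the two connecting particles named above; the capturing group
# makes re.split keep the particles in its output.
CONNECTORS = re.compile(r"(から|けれど)")


def split_on_connectors(text):
    """Split text at each connector, keeping the connectors as separate items."""
    return [part for part in CONNECTORS.split(text) if part]


# Made-up example sentence, not from the news item in this thread.
EXAMPLE = "雨が降ったから、家にいたけれど、退屈だった。"
pieces = split_on_connectors(EXAMPLE)
print(pieces)  # ['雨が降った', 'から', '、家にいた', 'けれど', '、退屈だった。']
```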

Linguistically, I don't think there is a distinction between something like

maikeru desu kedo, and maikeru desukedo... (kedo is used here as a colloquialism to soften the sentence).

Take a look at this link (http://users.tmok.com/~tumble/jpp/pars.html), and, if you have access to the book, Chapter 66 of "Making Sense of Japanese Grammar" by Zeljko Cipris & Shoko Hamano.

I know some of this doesn't help the original poster who still is tasked with breaking Japanese into "words", but I thought I'd chime in anyways. ;)

Qutiepie
Nov 15, 2006, 03:33
But we should ask ancient Chinese people first, then? XD





The Classical Chinese language (Wen-Yen-Wen) dictionary has a complete listing of the characters that the ancient Chinese used to indicate sentence and paragraph endings, etc. There was no written form of punctuation marks in ancient China.