Hi,
I'm starting to build a database that will contain English and Welsh
text. Up until now I have just used the default collation settings, and
everything has been fine - but I've only ever needed to use English.
There is also the possibility that, at some point in the future, the
database will have to handle several Arabic and Asian languages.
So, my question is what collation should I use to make the database as
future-proof as possible given the statements above? Does, for example,
the default collation actually support the Welsh alphabet? (Welsh has
four extra characters that I don't think appear in other European
languages - ŵ, ŷ and their upper case equivalents - possibly more, but I
don't actually know Welsh!) I notice that there isn't a bog-standard
"UTF-8" option in the collation list...
Lastly, does anyone have any general tips regarding multi-lingual databases?
Cheers,
--
Dylan Parry
http://electricfreedom.org | http://webpageworkshop.co.uk
The opinions stated above are not necessarily representative of
those of my cats. All opinions expressed are entirely your own.|||Hi Dylan,
Basically your collation should match the local environment the database
runs in. So if it's in the UK, it should be an English collation, etc.
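As a rough illustration of what a collation decides (how strings sort and compare; for char/varchar columns it also selects the code page), here is a minimal Python sketch. It only mimics an accent-insensitive ordering; the helper name accent_insensitive_key and the word list are made up for the example, and a real SQL Server collation applies much richer rules than this.

```python
import unicodedata


def accent_insensitive_key(word):
    # Decompose accented letters (e.g. ŵ -> w + combining circumflex),
    # then drop the combining marks so ŵ sorts next to w.
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


words = ["zinc", "ŵyn", "wy", "ŷd", "ysgol"]

# Plain code-point order pushes the accented Welsh letters after "z".
by_code_point = sorted(words)

# A collation-like key keeps ŵ with w and ŷ with y.
by_folded_key = sorted(words, key=accent_insensitive_key)

print(by_code_point)
print(by_folded_key)
```

The point of the sketch is only that the ordering rule is separate from the question of which characters a column can hold.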
With regard to your text, what you need to do is use the appropriate
data types in your tables.
To store text that may be in a language other than English, instead of
using a data type of, say, char(10), you would use nchar(10). The N prefix
ensures that the type is Unicode and not ANSI. Converting text to char can
lose or mangle characters that are not in the collation's code page, whereas
nchar stores the characters as they are. One important note: nchar consumes
twice as much space as char, so that's something to bear in mind.
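To make the char versus nchar point concrete, here is a small Python sketch. It assumes code page 1252 (the one behind the usual Latin1 collations) as a stand-in for a char column and UTF-16 as a stand-in for nchar. The helper fits_in_code_page is hypothetical, and SQL Server's own conversion may substitute a look-alike character instead of raising an error.

```python
def fits_in_code_page(text, code_page="cp1252"):
    """Return True if every character of text exists in the given code page."""
    try:
        text.encode(code_page)
        return True
    except UnicodeEncodeError:
        return False


# "dŵr" is Welsh for "water"; ŵ is not part of code page 1252.
english_ok = fits_in_code_page("water")
welsh_ok = fits_in_code_page("dŵr")

# A Unicode encoding stores and returns the Welsh word unchanged.
round_trip = "dŵr".encode("utf-16-le").decode("utf-16-le")

print(english_ok, welsh_ok, round_trip)
```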
hth
Barry Andrew
|||Barry Andrew Hall wrote:
> Hi Dylan,
> Basically your collation should match the local environment the
> database runs in. So if it's in the UK, it should be an English
> collation, etc.
Ah okay - so I should simply leave it "as is", right?
> With regard to your text, what you need to do is use the appropriate
> data types in your tables.
> To store text that may be in a language other than English, instead of
> using a data type of, say, char(10), you would use nchar(10). The N
> prefix ensures that the type is Unicode and not ANSI. Converting text
> to char can lose or mangle characters that are not in the collation's
> code page, whereas nchar stores the characters as they are.
Right, I understand now. So instead of using "text" to store several
paragraphs of writing, I should use "ntext" - but only in places that
could contain Welsh text.
> One important note: nchar consumes twice as much space as char, so
> that's something to bear in mind.
In much the same way as saving a text file in UTF-8 instead of
ISO-8859-1 will consume more space. That's not too much of an issue for
me, although I guess it could cause degradation in performance?
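A quick way to check that size comparison is the Python sketch below. It assumes UTF-16 as the stand-in for how nchar data is stored, and size_in_bytes is just an illustrative helper. Pure ASCII text is the same size in UTF-8 and ISO-8859-1, so the doubling corresponds to UTF-16 rather than UTF-8.

```python
def size_in_bytes(text, encoding):
    # Number of bytes the text occupies once encoded.
    return len(text.encode(encoding))


ascii_text = "All opinions expressed are entirely your own."
latin1_size = size_in_bytes(ascii_text, "iso-8859-1")  # one byte per character
utf8_size = size_in_bytes(ascii_text, "utf-8")         # also one byte per ASCII character
utf16_size = size_in_bytes(ascii_text, "utf-16-le")    # two bytes per character

welsh_text = "dŵr"
welsh_utf8_size = size_in_bytes(welsh_text, "utf-8")      # ŵ alone needs two bytes
welsh_utf16_size = size_in_bytes(welsh_text, "utf-16-le")

print(latin1_size, utf8_size, utf16_size)
print(welsh_utf8_size, welsh_utf16_size)
```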
--
Dylan Parry
http://electricfreedom.org | http://webpageworkshop.co.uk
The opinions stated above are not necessarily representative of
those of my cats. All opinions expressed are entirely your own.|||You're right on the performance. But to be fair, if it's designed solidly you
shouldn't have a thing to worry about.
Also, as you said, anything that could potentially be in another language
should have a data type of nvarchar(max) rather than ntext; that's probably
your best bet.