mirror of https://github.com/neovim/neovim.git (synced 2025-10-26 12:27:24 +00:00)
	encoding: update documentation
		| @@ -1029,8 +1029,8 @@ A string constant accepts these special characters: | ||||
| \x.	byte specified with one hex number (must be followed by non-hex char) | ||||
| \X..	same as \x.. | ||||
| \X.	same as \x. | ||||
| \u....	character specified with up to 4 hex numbers, stored according to the | ||||
| 	current value of 'encoding' (e.g., "\u02a4") | ||||
| \u....	character specified with up to 4 hex numbers, stored as UTF-8 | ||||
| 	(e.g., "\u02a4") | ||||
| \U....	same as \u but allows up to 8 hex numbers. | ||||
| \b	backspace <BS> | ||||
| \e	escape <Esc> | ||||
| @@ -1045,8 +1045,7 @@ A string constant accepts these special characters: | ||||
| 	utf-8 character, use \uxxxx as mentioned above. | ||||
|  | ||||
| Note that "\xff" is stored as the byte 255, which may be invalid in some | ||||
| encodings.  Use "\u00ff" to store character 255 according to the current value | ||||
| of 'encoding'. | ||||
| encodings.  Use "\u00ff" to store character 255 correctly as UTF-8. | ||||
|  | ||||
| Note that "\000" and "\x00" force the end of the string. | ||||
|  | ||||
| @@ -2532,8 +2531,6 @@ byteidxcomp({expr}, {nr})					*byteidxcomp()* | ||||
| <		The first and third echo result in 3 ('e' plus composing | ||||
| 		character is 3 bytes), the second echo results in 1 ('e' is | ||||
| 		one byte). | ||||
| 		Only works different from byteidx() when 'encoding' is set to | ||||
| 		a Unicode encoding. | ||||
|  | ||||
| call({func}, {arglist} [, {dict}])			*call()* *E699* | ||||
| 		Call function {func} with the items in |List| {arglist} as | ||||
| @@ -2568,11 +2565,11 @@ char2nr({expr}[, {utf8}])					*char2nr()* | ||||
| 		Return number value of the first char in {expr}.  Examples: > | ||||
| 			char2nr(" ")		returns 32 | ||||
| 			char2nr("ABC")		returns 65 | ||||
| <		When {utf8} is omitted or zero, the current 'encoding' is used. | ||||
| 		Example for "utf-8": > | ||||
| 			char2nr("á")		returns 225 | ||||
| 			char2nr("á"[0])		returns 195 | ||||
| <		With {utf8} set to 1, always treat as utf-8 characters. | ||||
| <		Non-ASCII characters are always treated as UTF-8 characters. | ||||
| 		{utf8} has no effect and exists only for | ||||
| 		backwards compatibility. | ||||
| 		A combining character is a separate character. | ||||
| 		|nr2char()| does the opposite. | ||||
|  | ||||
| @@ -4225,11 +4222,7 @@ iconv({expr}, {from}, {to})				*iconv()* | ||||
| 		Most conversions require Vim to be compiled with the |+iconv| | ||||
| 		feature.  Otherwise only UTF-8 to latin1 conversion and back | ||||
| 		can be done. | ||||
| 		This can be used to display messages with special characters, | ||||
| 		no matter what 'encoding' is set to.  Write the message in | ||||
| 		UTF-8 and use: > | ||||
| 			echo iconv(utf8_str, "utf-8", &enc) | ||||
| <		Note that Vim uses UTF-8 for all Unicode encodings, conversion | ||||
| 		Note that Vim uses UTF-8 for all Unicode encodings, conversion | ||||
| 		from/to UCS-2 is automatically changed to use UTF-8.  You | ||||
| 		cannot use UCS-2 in a string anyway, because of the NUL bytes. | ||||
| 		{only available when compiled with the |+multi_byte| feature} | ||||
| @@ -4513,9 +4506,7 @@ join({list} [, {sep}])					*join()* | ||||
| json_decode({expr})					*json_decode()* | ||||
| 		Convert {expr} from JSON object.  Accepts |readfile()|-style  | ||||
| 		list as the input, as well as regular string.  May output any  | ||||
| 		Vim value.  When 'encoding' is not UTF-8 string is converted  | ||||
| 		from UTF-8 to 'encoding', failing conversion fails  | ||||
| 		json_decode().  In the following cases it will output  | ||||
| 		Vim value. In the following cases it will output | ||||
| 		|msgpack-special-dict|: | ||||
| 		1. Dictionary contains duplicate key. | ||||
| 		2. Dictionary contains empty key. | ||||
| @@ -4523,33 +4514,22 @@ json_decode({expr})					*json_decode()* | ||||
| 		   dictionary and for string will be emitted in case string  | ||||
| 		   with NUL byte was a dictionary key. | ||||
|  | ||||
| 		Note: function treats its input as UTF-8 always regardless of  | ||||
| 		'encoding' value.  This is needed because JSON source is  | ||||
| 		supposed to be external (e.g. |readfile()|) and JSON standard  | ||||
| 		allows only a few encodings, of which UTF-8 is recommended and  | ||||
| 		the only one required to be supported.  Non-UTF-8 characters  | ||||
| 		are an error. | ||||
| 		Note: the function always treats its input as UTF-8.  The JSON | ||||
| 		standard allows only a few encodings, of which UTF-8 is | ||||
| 		recommended and the only one required to be supported. | ||||
| 		Non-UTF-8 characters are an error. | ||||
|  | ||||
| json_encode({expr})					*json_encode()* | ||||
| 		Convert {expr} into a JSON string.  Accepts  | ||||
| 		|msgpack-special-dict| as the input.  Converts from 'encoding'  | ||||
| 		to UTF-8 when encoding strings.  Will not convert |Funcref|s,  | ||||
| 		|msgpack-special-dict| as the input.  Will not convert |Funcref|s,  | ||||
| 		mappings with non-string keys (can be created as  | ||||
| 		|msgpack-special-dict|), values with self-referencing  | ||||
| 		containers, strings which contain non-UTF-8 characters,  | ||||
| 		pseudo-UTF-8 strings which contain codepoints reserved for  | ||||
| 		surrogate pairs (such strings are not valid UTF-8 strings).   | ||||
| 		When converting 'encoding' is taken into account, if it is not  | ||||
| 		"utf-8", then conversion is performed before encoding strings.   | ||||
| 		Non-printable characters are converted into "\u1234" escapes  | ||||
| 		or special escapes like "\t", other are dumped as-is. | ||||
|  | ||||
| 		Note: all characters above U+0079 are considered non-printable  | ||||
| 		when 'encoding' is not UTF-8.  This function always outputs  | ||||
| 		UTF-8 strings as required by the standard thus when 'encoding'  | ||||
| 		is not unicode resulting string will look incorrect if  | ||||
| 		"\u1234" notation is not used. | ||||
|  | ||||
| keys({dict})						*keys()* | ||||
| 		Return a |List| with all the keys of {dict}.  The |List| is in | ||||
| 		arbitrary order. | ||||
| @@ -4651,9 +4631,9 @@ line2byte({lnum})					*line2byte()* | ||||
| 		Return the byte count from the start of the buffer for line | ||||
| 		{lnum}.  This includes the end-of-line character, depending on | ||||
| 		the 'fileformat' option for the current buffer.  The first | ||||
| 		line returns 1. 'encoding' matters, 'fileencoding' is ignored. | ||||
| 		This can also be used to get the byte count for the line just | ||||
| 		below the last line: > | ||||
| 		line returns 1. UTF-8 encoding is used, 'fileencoding' is | ||||
| 		ignored.  This can also be used to get the byte count for the | ||||
| 		line just below the last line: > | ||||
| 			line2byte(line("$") + 1) | ||||
| <		This is the buffer size plus one.  If 'fileencoding' is empty | ||||
| 		it is the file size plus one. | ||||
| @@ -5172,10 +5152,10 @@ nr2char({expr}[, {utf8}])				*nr2char()* | ||||
| 		value {expr}.  Examples: > | ||||
| 			nr2char(64)		returns "@" | ||||
| 			nr2char(32)		returns " " | ||||
| <		When {utf8} is omitted or zero, the current 'encoding' is used. | ||||
| 		Example for "utf-8": > | ||||
| <		Example for "utf-8": > | ||||
| 			nr2char(300)		returns I with bow character | ||||
| <		With {utf8} set to 1, always return utf-8 characters. | ||||
| <		UTF-8 encoding is always used; the {utf8} argument has no | ||||
| 		effect and exists only for backwards compatibility. | ||||
| 		Note that a NUL character in the file is specified with | ||||
| 		nr2char(10), because NULs are represented with newline | ||||
| 		characters.  nr2char(0) is a real NUL and terminates the | ||||
| @@ -5417,7 +5397,7 @@ py3eval({expr})						*py3eval()* | ||||
| 		converted to Vim data structures. | ||||
| 		Numbers and strings are returned as they are (strings are  | ||||
| 		copied though, Unicode strings are additionally converted to  | ||||
| 		'encoding'). | ||||
| 		UTF-8). | ||||
| 		Lists are represented as Vim |List| type. | ||||
| 		Dictionaries are represented as Vim |Dictionary| type with  | ||||
| 		keys converted to strings. | ||||
| @@ -5467,8 +5447,7 @@ readfile({fname} [, {binary} [, {max}]]) | ||||
| 		Otherwise: | ||||
| 		- CR characters that appear before a NL are removed. | ||||
| 		- Whether the last line ends in a NL or not does not matter. | ||||
| 		- When 'encoding' is Unicode any UTF-8 byte order mark is | ||||
| 		  removed from the text. | ||||
| 		- Any UTF-8 byte order mark is removed from the text. | ||||
| 		When {max} is given this specifies the maximum number of lines | ||||
| 		to be read.  Useful if you only want to check the first ten | ||||
| 		lines of a file: > | ||||
| @@ -6621,8 +6600,7 @@ string({expr})	Return {expr} converted to a String.  If {expr} is a Number, | ||||
| 		for infinite and NaN floating-point values representations  | ||||
| 		which use |str2float()|.  Strings are also dumped literally,  | ||||
| 		only single quote is escaped, which does not allow using YAML  | ||||
| 		for parsing back binary strings (including text when  | ||||
| 		'encoding' is not UTF-8).  |eval()| should always work for  | ||||
| 		for parsing back binary strings.  |eval()| should always work for  | ||||
| 		strings and floats though and this is the only official  | ||||
| 		method, use |msgpackdump()| or |json_encode()| if you need to  | ||||
| 		share data with other application. | ||||
|   | ||||
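The eval.txt hunks above rely on the difference between a character number and a UTF-8 byte value. That difference can be checked outside Vim. The following is a minimal sketch in Python, used only as a neutral stand-in for Vim script. The helper names are hypothetical and nothing here is Nvim's implementation.

```python
# Illustration only: Python stand-ins for the UTF-8 behaviour described above.
# The helper names are hypothetical; they are not Nvim APIs.

def first_char_number(s: str) -> int:
    """Code point of the first character, like char2nr("á") giving 225."""
    return ord(s[0])

def first_byte_number(s: str) -> int:
    """Value of the first UTF-8 byte, like char2nr("á"[0]) giving 195."""
    return s.encode("utf-8")[0]

def strip_utf8_bom(data: bytes) -> bytes:
    """Drop a leading UTF-8 byte order mark (EF BB BF), if present."""
    bom = b"\xef\xbb\xbf"
    return data[len(bom):] if data.startswith(bom) else data

a_acute = "\u00e1"  # "á"
print(first_char_number(a_acute))   # 225
print(first_byte_number(a_acute))   # 195 (0xC3, the UTF-8 lead byte)

# "\u00ff" stored as UTF-8 takes two bytes; a lone byte 255 is not valid UTF-8.
print("\u00ff".encode("utf-8"))     # b'\xc3\xbf'
try:
    b"\xff".decode("utf-8")
except UnicodeDecodeError:
    print("byte 0xFF alone is invalid UTF-8")

print(strip_utf8_bom(b"\xef\xbb\xbfhello"))  # b'hello'
```

The two-byte result for "\u00ff" is why the text recommends "\u00ff" over "\xff" when character 255 is meant.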
| @@ -70,29 +70,24 @@ See |mbyte-locale| for details. | ||||
|  | ||||
| ENCODING | ||||
|  | ||||
| If your locale works properly, Vim will try to set the 'encoding' option | ||||
| accordingly.  If this doesn't work you can overrule its value: > | ||||
| Nvim always uses UTF-8 internally.  Thus the 'encoding' option is always set | ||||
| to "utf-8" and cannot be changed. | ||||
|  | ||||
| 	:set encoding=utf-8 | ||||
| All the text that is used inside Vim will be in UTF-8. Not only the text in | ||||
| the buffers, but also in registers, variables, etc. | ||||
|  | ||||
| See |encoding-values| for a list of acceptable values. | ||||
|  | ||||
| The result is that all the text that is used inside Vim will be in this | ||||
| encoding.  Not only the text in the buffers, but also in registers, variables, | ||||
| etc. 'encoding' is read-only after startup because changing it would make the | ||||
| existing text invalid. | ||||
|  | ||||
| You can edit files in another encoding than what 'encoding' is set to.  Vim | ||||
| You can edit files in encodings other than UTF-8.  Nvim | ||||
| will convert the file when you read it and convert it back when you write it. | ||||
| See 'fileencoding', 'fileencodings' and |++enc|. | ||||
|  | ||||
|  | ||||
| DISPLAY AND FONTS | ||||
|  | ||||
| If you are working in a terminal (emulator) you must make sure it accepts the | ||||
| same encoding as which Vim is working with. | ||||
| If you are working in a terminal (emulator) you must make sure it accepts | ||||
| UTF-8, the encoding which Vim is working with. Otherwise only ASCII can | ||||
| be displayed and edited correctly. | ||||
|  | ||||
| For the GUI you must select fonts that work with the current 'encoding'.  This | ||||
| For the GUI you must select fonts that work with UTF-8.  This | ||||
| is the difficult part.  It depends on the system you are using, the locale and | ||||
| a few other things.  See the chapters on fonts: |mbyte-fonts-X11| for | ||||
| X-Windows and |mbyte-fonts-MSwin| for MS-Windows. | ||||
| @@ -216,10 +211,9 @@ You could make a small shell script for this. | ||||
| ============================================================================== | ||||
| 3.  Encoding				*mbyte-encoding* | ||||
|  | ||||
| Vim uses the 'encoding' option to specify how characters are identified and | ||||
| encoded when they are used inside Vim.  This applies to all the places where | ||||
| text is used, including buffers (files loaded into memory), registers and | ||||
| variables. | ||||
| In Nvim, UTF-8 is always used internally to encode characters. | ||||
| This applies to all the places where text is used, including buffers (files | ||||
| loaded into memory), registers and variables. | ||||
|  | ||||
| 							*charset* *codeset* | ||||
| Charset is another name for encoding.  There are subtle differences, but these | ||||
| @@ -240,7 +234,7 @@ matter what language is used.  Thus you might see the right text even when the | ||||
| encoding was set wrong. | ||||
|  | ||||
| 							*encoding-names* | ||||
| Vim can use many different character encodings.  There are three major groups: | ||||
| Vim can edit files in different character encodings.  There are three major groups: | ||||
|  | ||||
| 1   8bit	Single-byte encodings, 256 different characters.  Mostly used | ||||
| 		in USA and Europe.  Example: ISO-8859-1 (Latin1).  All | ||||
| @@ -255,11 +249,10 @@ u   Unicode	Universal encoding, can replace all others.  ISO 10646. | ||||
| 		Millions of different characters.  Example: UTF-8.  The | ||||
| 		relation between bytes and screen cells is complex. | ||||
|  | ||||
| Other encodings cannot be used by Vim internally.  But files in other | ||||
| Only UTF-8 is used by Vim internally.  But files in other | ||||
| encodings can be edited by using conversion, see 'fileencoding'. | ||||
| Note that all encodings must use ASCII for the characters up to 128. | ||||
|  | ||||
| Supported 'encoding' values are:			*encoding-values* | ||||
| Recognized 'fileencoding' values include:		*encoding-values* | ||||
| 1   latin1	8-bit characters (ISO 8859-1, also used for cp1252) | ||||
| 1   iso-8859-n	ISO_8859 variant (n = 2 to 15) | ||||
| 1   koi8-r	Russian | ||||
| @@ -311,11 +304,11 @@ u   ucs-4	32 bit UCS-4 encoded Unicode (ISO/IEC 10646-1) | ||||
| u   ucs-4le	like ucs-4, little endian | ||||
|  | ||||
| The {name} can be any encoding name that your system supports.  It is passed | ||||
| to iconv() to convert between the encoding of the file and the current locale. | ||||
| to iconv() to convert between UTF-8 and the encoding of the file. | ||||
| For MS-Windows "cp{number}" means using codepage {number}. | ||||
| Examples: > | ||||
| 		:set encoding=8bit-cp1252 | ||||
| 		:set encoding=2byte-cp932 | ||||
| 		:set fileencoding=8bit-cp1252 | ||||
| 		:set fileencoding=2byte-cp932 | ||||
|  | ||||
| The MS-Windows codepage 1252 is very similar to latin1.  For practical reasons | ||||
| the same encoding is used and it's called latin1.  'isprint' can be used to | ||||
| @@ -337,8 +330,7 @@ u   ucs-2be	same as ucs-2 (big endian) | ||||
| u   ucs-4be	same as ucs-4 (big endian) | ||||
| u   utf-32	same as ucs-4 | ||||
| u   utf-32le	same as ucs-4le | ||||
|     default     stands for the default value of 'encoding', depends on the | ||||
| 		environment | ||||
|     default     the encoding of the current locale. | ||||
|  | ||||
| For the UCS codes the byte order matters.  This is tricky, use UTF-8 whenever | ||||
| you can.  The default is to use big-endian (most significant byte comes | ||||
| @@ -363,13 +355,12 @@ or when conversion is not possible: | ||||
| CONVERSION						*charset-conversion* | ||||
|  | ||||
| Vim will automatically convert from one to another encoding in several places: | ||||
| - When reading a file and 'fileencoding' is different from 'encoding' | ||||
| - When writing a file and 'fileencoding' is different from 'encoding' | ||||
| - When reading a file and 'fileencoding' is different from "utf-8" | ||||
| - When writing a file and 'fileencoding' is different from "utf-8" | ||||
| - When displaying messages and the encoding used for LC_MESSAGES differs from | ||||
|   'encoding' (requires a gettext version that supports this). | ||||
|   "utf-8" (requires a gettext version that supports this). | ||||
| - When reading a Vim script where |:scriptencoding| is different from | ||||
|   'encoding'. | ||||
| - When reading or writing a |shada| file. | ||||
|   "utf-8". | ||||
| Most of these require the |+iconv| feature.  Conversion for reading and | ||||
| writing files may also be specified with the 'charconvert' option. | ||||
|  | ||||
| @@ -408,11 +399,11 @@ Useful utilities for converting the charset: | ||||
|  | ||||
|  | ||||
| 							*mbyte-conversion* | ||||
| When reading and writing files in an encoding different from 'encoding', | ||||
| When reading and writing files in an encoding different from "utf-8", | ||||
| conversion needs to be done.  These conversions are supported: | ||||
| - All conversions between Latin-1 (ISO-8859-1), UTF-8, UCS-2 and UCS-4 are | ||||
|   handled internally. | ||||
| - For MS-Windows, when 'encoding' is a Unicode encoding, conversion from and | ||||
| - For MS-Windows, conversion from and | ||||
|   to any codepage should work. | ||||
| - Conversion specified with 'charconvert' | ||||
| - Conversion with the iconv library, if it is available. | ||||
| @@ -468,8 +459,6 @@ and you will have a working UTF-8 terminal emulator.  Try both > | ||||
| with the demo text that comes with ucs-fonts.tar.gz in order to see | ||||
| whether there are any problems with UTF-8 in your xterm. | ||||
|  | ||||
| For Vim you may need to set 'encoding' to "utf-8". | ||||
|  | ||||
| ============================================================================== | ||||
| 5.  Fonts on X11					*mbyte-fonts-X11* | ||||
|  | ||||
| @@ -864,11 +853,11 @@ between two keyboard settings. | ||||
| The value of the 'keymap' option specifies a keymap file to use.  The name of | ||||
| this file is one of these two: | ||||
|  | ||||
| 	keymap/{keymap}_{encoding}.vim | ||||
| 	keymap/{keymap}_utf-8.vim | ||||
| 	keymap/{keymap}.vim | ||||
|  | ||||
| Here {keymap} is the value of the 'keymap' option and {encoding} of the | ||||
| 'encoding' option.  The file name with the {encoding} included is tried first. | ||||
| Here {keymap} is the value of the 'keymap' option. | ||||
| The file name with "utf-8" included is tried first. | ||||
|  | ||||
| 'runtimepath' is used to find these files.  To see an overview of all | ||||
| available keymap files, use this: > | ||||
| @@ -950,7 +939,7 @@ this is unusual.  But you can use various ways to specify the character: > | ||||
| 	A	<char-0141>	octal value | ||||
| 	x	<Space>		special key name | ||||
|  | ||||
| The characters are assumed to be encoded for the current value of 'encoding'. | ||||
| The characters are assumed to be encoded in UTF-8. | ||||
| It's possible to use ":scriptencoding" when all characters are given | ||||
| literally.  That doesn't work when using the <char-> construct, because the | ||||
| conversion is done on the keymap file, not on the resulting character. | ||||
| @@ -1170,21 +1159,13 @@ Useful commands: | ||||
|   message is truncated, use ":messages"). | ||||
| - "g8" shows the bytes used in a UTF-8 character, also the composing | ||||
|   characters, as hex numbers. | ||||
| - ":set encoding=utf-8 fileencodings=" forces using UTF-8 for all files.  The | ||||
|   default is to use the current locale for 'encoding' and set 'fileencodings' | ||||
|   to automatically detect the encoding of a file. | ||||
| - ":set fileencodings=" forces using UTF-8 for all files.  The | ||||
|   default is to automatically detect the encoding of a file. | ||||
|  | ||||
|  | ||||
| STARTING VIM | ||||
|  | ||||
| If your current locale is in an utf-8 encoding, Vim will automatically start | ||||
| in utf-8 mode. | ||||
|  | ||||
| If you are using another locale: > | ||||
|  | ||||
| 	set encoding=utf-8 | ||||
|  | ||||
| You might also want to select the font used for the menus.  Unfortunately this | ||||
| You might want to select the font used for the menus.  Unfortunately this | ||||
| doesn't always work.  See the system specific remarks below, and 'langmenu'. | ||||
|  | ||||
|  | ||||
| @@ -1245,10 +1226,9 @@ not everybody is able to type a composing character. | ||||
| These options are relevant for editing multi-byte files.  Check the help in | ||||
| options.txt for detailed information. | ||||
|  | ||||
| 'encoding'	Encoding used for the keyboard and display.  It is also the | ||||
| 		default encoding for files. | ||||
| 'encoding'	Internal text encoding, always "utf-8". | ||||
|  | ||||
| 'fileencoding'	Encoding of a file.  When it's different from 'encoding' | ||||
| 'fileencoding'	Encoding of a file.  When it's different from "utf-8" | ||||
| 		conversion is done when reading or writing the file. | ||||
|  | ||||
| 'fileencodings'	List of possible encodings of a file.  When opening a file | ||||
|   | ||||
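The mbyte.txt hunks above describe a round trip: a file is converted to UTF-8 when read and converted back to its 'fileencoding' when written. The following is a rough sketch of that idea in Python, not Nvim's implementation. The function names are hypothetical, and the "?" replacement is Python's behaviour, chosen only for illustration; what Nvim does with unconvertible characters is governed by |++bad|.

```python
# Illustration only: a Python stand-in for 'fileencoding' conversion.
# Function names are hypothetical; this is not how Nvim implements it.

def read_as_utf8(raw: bytes, fileencoding: str) -> bytes:
    """Convert file bytes to the internal UTF-8 representation."""
    return raw.decode(fileencoding).encode("utf-8")

def write_from_utf8(internal: bytes, fileencoding: str) -> bytes:
    """Convert internal UTF-8 back to the file's encoding.

    Characters the target encoding cannot represent are replaced with "?",
    which is why converting to a non-Unicode encoding can lose information.
    """
    return internal.decode("utf-8").encode(fileencoding, errors="replace")

latin1_file = b"caf\xe9"                        # "café" in latin1
internal = read_as_utf8(latin1_file, "latin-1")
print(internal)                                 # b'caf\xc3\xa9'
print(write_from_utf8(internal, "latin-1"))     # b'caf\xe9' (round trip)

# U+20AC (euro sign) has no latin1 equivalent, so it is lost on write.
euro = "\u20ac".encode("utf-8")
print(write_from_utf8(euro, "latin-1"))         # b'?'
```

The last call shows the case the 'fileencoding' warning is about: text that is valid internally but has no representation in the target encoding.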
| @@ -52,7 +52,6 @@ achieve special effects.  These options come in three forms: | ||||
| :se[t] all&		Set all options to their default value.  The values of | ||||
| 			these options are not changed: | ||||
| 			  'columns' | ||||
| 			  'encoding' | ||||
| 			  'lines' | ||||
| 			Warning: This may have a lot of side effects. | ||||
|  | ||||
| @@ -615,7 +614,6 @@ A jump table for the options with a short description can be found at |Q_op|. | ||||
| 			global | ||||
| 			{only available when compiled with the |+multi_byte| | ||||
| 			feature} | ||||
| 	Only effective when 'encoding' is "utf-8" or another Unicode encoding. | ||||
| 	Tells Vim what to do with characters with East Asian Width Class | ||||
| 	Ambiguous (such as Euro, Registered Sign, Copyright Sign, Greek | ||||
| 	letters, Cyrillic letters). | ||||
| @@ -668,7 +666,6 @@ A jump table for the options with a short description can be found at |Q_op|. | ||||
| 	- Set the 'keymap' option to "arabic"; in Insert mode CTRL-^ toggles | ||||
| 	  between typing English and Arabic key mapping. | ||||
| 	- Set the 'delcombine' option | ||||
| 	Note that 'encoding' must be "utf-8" for working with Arabic text. | ||||
|  | ||||
| 	Resetting this option will: | ||||
| 	- Reset the 'rightleft' option. | ||||
| @@ -1078,8 +1075,7 @@ A jump table for the options with a short description can be found at |Q_op|. | ||||
| 			{not available when compiled without the |+linebreak| | ||||
| 			feature} | ||||
| 	This option lets you choose which characters might cause a line | ||||
| 	break if 'linebreak' is on.  Only works for ASCII and also for 8-bit | ||||
| 	characters when 'encoding' is an 8-bit encoding. | ||||
| 	break if 'linebreak' is on.  Only works for ASCII characters. | ||||
|  | ||||
| 						*'breakindent'* *'bri'* | ||||
| 'breakindent' 'bri'	boolean (default off) | ||||
| @@ -1214,11 +1210,9 @@ A jump table for the options with a short description can be found at |Q_op|. | ||||
| 	Specifies details about changing the case of letters.  It may contain | ||||
| 	these words, separated by a comma: | ||||
| 	internal	Use internal case mapping functions, the current | ||||
| 			locale does not change the case mapping.  This only | ||||
| 			matters when 'encoding' is a Unicode encoding, | ||||
| 			"latin1" or "iso-8859-15".  When "internal" is | ||||
| 			omitted, the towupper() and towlower() system library | ||||
| 			functions are used when available. | ||||
| 			locale does not change the case mapping. When | ||||
| 			"internal" is omitted, the towupper() and towlower() | ||||
| 			system library functions are used when available. | ||||
| 	keepascii	For the ASCII characters (0x00 to 0x7f) use the US | ||||
| 			case mapping, the current locale is not effective. | ||||
| 			This probably only matters for Turkish. | ||||
| @@ -1271,13 +1265,12 @@ A jump table for the options with a short description can be found at |Q_op|. | ||||
| 	file to convert from.  You will have to save the text in a file first. | ||||
| 	The expression must return zero or an empty string for success, | ||||
| 	non-zero for failure. | ||||
| 	The possible encoding names encountered are in 'encoding'. | ||||
| 	See |encoding-names| for possible encoding names. | ||||
| 	Additionally, names given in 'fileencodings' and 'fileencoding' are | ||||
| 	used. | ||||
| 	Conversion between "latin1", "unicode", "ucs-2", "ucs-4" and "utf-8" | ||||
| 	is done internally by Vim, 'charconvert' is not used for this. | ||||
| 	'charconvert' is also used to convert the shada file, if 'encoding' is  | ||||
| 	not "utf-8".  Also used for Unicode conversion. | ||||
| 	Also used for Unicode conversion. | ||||
| 	Example: > | ||||
| 		set charconvert=CharConvert() | ||||
| 		fun CharConvert() | ||||
| @@ -1292,8 +1285,6 @@ A jump table for the options with a short description can be found at |Q_op|. | ||||
| 		v:fname_in		name of the input file | ||||
| 		v:fname_out		name of the output file | ||||
| 	Note that v:fname_in and v:fname_out will never be the same. | ||||
| 	Note that v:charconvert_from and v:charconvert_to may be different | ||||
| 	from 'encoding'.  Vim internally uses UTF-8 instead of UCS-2 or UCS-4. | ||||
| 	This option cannot be set from a |modeline| or in the |sandbox|, for | ||||
| 	security reasons. | ||||
|  | ||||
| @@ -2140,44 +2131,14 @@ A jump table for the options with a short description can be found at |Q_op|. | ||||
|  | ||||
|  | ||||
| 					*'encoding'* *'enc'* *E543* | ||||
| 'encoding' 'enc'	string (default: "utf-8") | ||||
| 			global | ||||
| 			{only available when compiled with the |+multi_byte| | ||||
| 			feature} | ||||
| 	Sets the character encoding used inside Vim.  It applies to text in | ||||
| 	the buffers, registers, Strings in expressions, text stored in the | ||||
| 	shada file, etc.  It sets the kind of characters which Vim can work | ||||
| 	with.  See |encoding-names| for the possible values. | ||||
| 'encoding' 'enc'	Removed. |vim-differences| {Nvim} | ||||
| 	Nvim always uses UTF-8 internally. RPC communication | ||||
| 	(remote plugins/GUIs) must use UTF-8 strings. | ||||
|  | ||||
| 	'encoding' cannot be changed after startup, because (1) it causes | ||||
| 	non-ASCII text inside Vim to become invalid, and (2) it complicates | ||||
| 	runtime logic.  The recommended 'encoding' is "utf-8".  Remote plugins | ||||
| 	and GUIs only support utf-8. See |multibyte|. | ||||
|  | ||||
| 	The character encoding of files can be different from 'encoding'. | ||||
| 	The character encoding of files can be different from UTF-8. | ||||
| 	This is specified with 'fileencoding'.  The conversion is done with | ||||
| 	iconv() or as specified with 'charconvert'. | ||||
|  | ||||
| 	If you need to know whether 'encoding' is a multi-byte encoding, you | ||||
| 	can use: > | ||||
| 		if has("multi_byte_encoding") | ||||
| < | ||||
| 	When you set this option, it fires the |EncodingChanged| autocommand | ||||
| 	event so that you can set up fonts if necessary. | ||||
|  | ||||
| 	When the option is set, the value is converted to lowercase.  Thus | ||||
| 	you can set it with uppercase values too.  Underscores are translated | ||||
| 	to '-' signs. | ||||
| 	When the encoding is recognized, it is changed to the standard name. | ||||
| 	For example "Latin-1" becomes "latin1", "ISO_88592" becomes | ||||
| 	"iso-8859-2" and "utf8" becomes "utf-8". | ||||
|  | ||||
| 	When "unicode", "ucs-2" or "ucs-4" is used, Vim internally uses utf-8. | ||||
| 	You don't notice this while editing, but it does matter for the | ||||
| 	|shada-file|.  And Vim expects the terminal to use utf-8 too.  Thus | ||||
| 	setting 'encoding' to one of these values instead of utf-8 only has | ||||
| 	effect for encoding used for files when 'fileencoding' is empty. | ||||
|  | ||||
| 			*'endofline'* *'eol'* *'noendofline'* *'noeol'* | ||||
| 'endofline' 'eol'	boolean	(default on) | ||||
| 			local to buffer | ||||
| @@ -2304,20 +2265,14 @@ A jump table for the options with a short description can be found at |Q_op|. | ||||
| 			feature} | ||||
| 	Sets the character encoding for the file of this buffer. | ||||
|  | ||||
| 	When 'fileencoding' is different from 'encoding', conversion will be | ||||
| 	When 'fileencoding' is different from "utf-8", conversion will be | ||||
| 	done when writing the file.  For reading see below. | ||||
| 	When 'fileencoding' is empty, the same value as 'encoding' will be | ||||
| 	used (no conversion when reading or writing a file). | ||||
| 	Conversion will also be done when 'encoding' and 'fileencoding' are | ||||
| 	both a Unicode encoding and 'fileencoding' is not utf-8.  That's | ||||
| 	because internally Unicode is always stored as utf-8. | ||||
| 		WARNING: Conversion can cause loss of information!  When | ||||
| 		'encoding' is "utf-8" or another Unicode encoding, conversion | ||||
| 		is most likely done in a way that the reverse conversion | ||||
| 		results in the same text.  When 'encoding' is not "utf-8" some | ||||
| 		characters may be lost! | ||||
| 	When 'fileencoding' is empty, the file will be saved with UTF-8 | ||||
| 	encoding (no conversion when reading or writing a file). | ||||
| 		WARNING: Conversion to a non-Unicode encoding can cause loss of | ||||
| 		information!  | ||||
|  | ||||
| 	See 'encoding' for the possible values.  Additionally, values may be | ||||
| 	See |encoding-names| for the possible values.  Additionally, values may be | ||||
| 	specified that can be handled by the converter, see | ||||
| 	|mbyte-conversion|. | ||||
|  | ||||
| @@ -2330,8 +2285,8 @@ A jump table for the options with a short description can be found at |Q_op|. | ||||
| 	Prepending "8bit-" and "2byte-" has no meaning here, they are ignored. | ||||
| 	When the option is set, the value is converted to lowercase.  Thus | ||||
| 	you can set it with uppercase values too.  '_' characters are | ||||
| 	replaced with '-'.  If a name is recognized from the list for | ||||
| 	'encoding', it is replaced by the standard name.  For example | ||||
| 	replaced with '-'.  If a name is recognized from the list at | ||||
| 	|encoding-names|, it is replaced by the standard name.  For example | ||||
| 	"ISO8859-2" becomes "iso-8859-2". | ||||
|  | ||||
| 	When this option is set, after starting to edit a file, the 'modified' | ||||
| @@ -2354,12 +2309,8 @@ A jump table for the options with a short description can be found at |Q_op|. | ||||
| 	mentioned character encoding.  If an error is detected, the next one | ||||
| 	in the list is tried.  When an encoding is found that works, | ||||
| 	'fileencoding' is set to it.  If all fail, 'fileencoding' is set to | ||||
| 	an empty string, which means the value of 'encoding' is used. | ||||
| 		WARNING: Conversion can cause loss of information!  When | ||||
| 		'encoding' is "utf-8" (or one of the other Unicode variants) | ||||
| 		conversion is most likely done in a way that the reverse | ||||
| 		conversion results in the same text.  When 'encoding' is not | ||||
| 		"utf-8" some non-ASCII characters may be lost!  You can use | ||||
| 	an empty string, which means that UTF-8 is used. | ||||
| 		WARNING: Conversion can cause loss of information! You can use | ||||
| 		the |++bad| argument to specify what is done with characters | ||||
| 		that can't be converted. | ||||
| 	For an empty file or a file with only ASCII characters most encodings | ||||
| @@ -2385,11 +2336,11 @@ A jump table for the options with a short description can be found at |Q_op|. | ||||
| 	because Vim cannot detect an error, thus the encoding is always | ||||
| 	accepted. | ||||
| 	The special value "default" can be used for the encoding from the | ||||
| 	environment.  It is useful when 'encoding' is set to "utf-8" and | ||||
| 	your environment uses a non-latin1 encoding, such as Russian. | ||||
| 	When 'encoding' is "utf-8" and a file contains an illegal byte | ||||
| 	sequence it won't be recognized as UTF-8.  You can use the |8g8| | ||||
| 	command to find the illegal byte sequence. | ||||
| 	environment.  It is useful when your environment uses a non-latin1 | ||||
| 	encoding, such as Russian. | ||||
| 	When a file contains an illegal UTF-8 byte sequence it won't be | ||||
| 	recognized as "utf-8".  You can use the |8g8| command to find the | ||||
| 	illegal byte sequence. | ||||
| 	WRONG VALUES:			WHAT'S WRONG: | ||||
| 		latin1,utf-8		"latin1" will always be used | ||||
| 		utf-8,ucs-bom,latin1	BOM won't be recognized in an utf-8 | ||||
| @@ -3048,8 +2999,7 @@ A jump table for the options with a short description can be found at |Q_op|. | ||||
| 	Note: The size of these fonts must be exactly twice as wide as the one | ||||
| 	specified with 'guifont' and the same height. | ||||
|  | ||||
| 	'guifontwide' is only used when 'encoding' is set to "utf-8" and | ||||
| 	'guifontset' is empty or invalid. | ||||
| 	'guifontwide' is only used when 'guifontset' is empty or invalid. | ||||
| 	When 'guifont' is set and a valid font is found in it and | ||||
| 	'guifontwide' is empty Vim will attempt to find a matching | ||||
| 	double-width font and set 'guifontwide' to it. | ||||
| @@ -3702,7 +3652,7 @@ A jump table for the options with a short description can be found at |Q_op|. | ||||
| 		128 - 159	"~@" - "~_" | ||||
| 		160 - 254	"| " - "|~" | ||||
| 		   255		"~?" | ||||
| 	When 'encoding' is a Unicode one, illegal bytes from 128 to 255 are | ||||
| 	Illegal bytes from 128 to 255 (invalid UTF-8) are | ||||
| 	displayed as <xx>, with the hexadecimal value of the byte. | ||||
| 	When 'display' contains "uhex" all unprintable characters are | ||||
| 	displayed as <xx>. | ||||
| @@ -3980,8 +3930,7 @@ A jump table for the options with a short description can be found at |Q_op|. | ||||
| 			omitted. | ||||
|  | ||||
| 	The characters ':' and ',' should not be used.  UTF-8 characters can | ||||
| 	be used when 'encoding' is "utf-8", otherwise only printable | ||||
| 	characters are allowed.  All characters must be single width. | ||||
| 	be used.  All characters must be single width. | ||||
|  | ||||
| 	Examples: > | ||||
| 	    :set lcs=tab:>-,trail:- | ||||
| @@ -4078,7 +4027,6 @@ A jump table for the options with a short description can be found at |Q_op|. | ||||
| 			{only available when compiled with the |+multi_byte| | ||||
| 			feature} | ||||
| 	The maximum number of combining characters supported for displaying. | ||||
| 	Only used when 'encoding' is "utf-8". | ||||
| 	The default is OK for most languages.  Hebrew may require 4. | ||||
| 	Maximum value is 6. | ||||
| 	Even when this option is set to 2 you can still edit text with more | ||||
| @@ -5825,9 +5773,6 @@ A jump table for the options with a short description can be found at |Q_op|. | ||||
| 	(_xx is an underscore, two letters and followed by a non-letter). | ||||
| 	This is mainly for testing purposes.  You must make sure the correct | ||||
| 	encoding is used, Vim doesn't check it. | ||||
| 	When 'encoding' is set the word lists are reloaded.  Thus it's a good | ||||
| 	idea to set 'spelllang' after setting 'encoding' to avoid loading the | ||||
| 	files twice. | ||||
| 	How the related spell files are found is explained here: |spell-load|. | ||||
|  | ||||
| 	If the |spellfile.vim| plugin is active and you use a language name | ||||
|   | ||||
| @@ -40,7 +40,6 @@ these differences. | ||||
| - 'complete' doesn't include "i" | ||||
| - 'directory' defaults to ~/.local/share/nvim/swap// (|xdg|), auto-created | ||||
| - 'display' defaults to "lastline" | ||||
| - 'encoding' defaults to "utf-8" | ||||
| - 'formatoptions' defaults to "tcqj" | ||||
| - 'history' defaults to 10000 (the maximum) | ||||
| - 'hlsearch' is set by default | ||||
| @@ -159,7 +158,7 @@ are always available and may be used simultaneously in separate plugins.  The | ||||
|    'p')) mkdir() will silently exit. In Vim this was an error. | ||||
| 3. mkdir() error messages now include strerror() text when mkdir fails. | ||||
|  | ||||
| 'encoding' cannot be changed after startup. | ||||
| 'encoding' is always "utf-8". | ||||
|  | ||||
| |string()| and |:echo| behaviour changed: | ||||
| 1. No maximum recursion depth limit is applied to nested container | ||||
| @@ -266,6 +265,7 @@ Highlight groups: | ||||
| Other options: | ||||
|   'antialias' | ||||
|   'cpoptions' ("g", "w", "H", "*", "-", "j", and all POSIX flags were removed) | ||||
|   'encoding' ("utf-8" is always used) | ||||
|   'guioptions' "t" flag was removed | ||||
|   *'guipty'* (Nvim uses pipes and PTYs consistently on all platforms.) | ||||
|   *'imactivatefunc'* *'imaf'* | ||||
|   | ||||
	 Björn Linse