| This article describes the class '''I2CE_Hyphen'''.
| | #REDIRECT [[Class: I2CE_Hyphen (4.1.7)]] |
| It is contained in the module [[iHRIS Module List#textlayout|textlayout]] in the package [https://launchpad.net/textlayout TextLayout Tools]
| |
| | |
| The class is defined in the file: [http://bazaar.launchpad.net/~intrahealth+informatics/textlayout/4.0.0-release/annotate/head:/lib/I2CE_Hyphen.php lib/I2CE_Hyphen.php]
| |
| | |
| A PHP script implementing Knuth's and Liang's hyphenation algorithm
| |
| as described in http://lingucomponent.openoffice.org/hyphenator.html
| |
| In particular, it uses the 'mashed up' dictionary files.
| |
| Note: Internally, by default, all strings are encoded as UTF-8.
| |
| This is highly recommended so that the Unicode preg functions work
| |
| quickly (without having to convert to UTF-8 and then back).
| |
| Note: Does not (yet) support the non-standard hyphenation of Hungarian,
| |
| Swedish, etc.
| |
| @subpackage TextLayout
| |
| *Author: Carl Leitner <litlfred@ibiblio.org>
| |
| ==Variables==
| |
| ===$enc===
| |
| The [[Class: I2CE_Encoding | I2CE_Encoding]] used for internal storage of strings.
| |
| *Type: protected $enc
| |
| | |
| ===$patterns===
| |
| An associative array containing the hyphenation patterns.
| |
| *Type: protected $patterns
| |
| | |
| ===$trans===
| |
| *Type: protected $trans
| |
| | |
| ==Methods==
| |
| ===HyphenateWord()===
| |
| Hyphenates a word according to the loaded dictionary
| |
| WARNING: the word is assumed to consist only of letters. If you need something more general,
| |
| see getWordParts().
| |
| *Signature: public function HyphenateWord($word,$supress)
| |
| *Returns: [http://www.php.net/manual/en/language.types.array.php array ] of int containing the hyphenation points. The hyphenation points are the offsets of the beginning of each
| |
| subword, so 0 is always a hyphenation point.
| |
| Parameters:
| |
| * [http://www.php.net/manual/en/language.types.string.php string ] $word<br/>the word to be hyphenated
| |
| * [http://www.php.net/manual/en/language.types.boolean.php bool ] $supress<br/>true (default) to suppress hyphenation points at the beginning/end of a word.
| |
| **Default Value: true
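
The offsets returned by HyphenateWord() can be pictured with a minimal sketch of Liang-style pattern matching. This is not the I2CE_Hyphen implementation: the function name <code>sketchHyphenationPoints</code>, the toy pattern table, and the ASCII-only handling are assumptions made for illustration, and the <code>$supress</code> flag is not modeled.

```php
<?php
// Hypothetical helper, not part of I2CE_Hyphen: a tiny Liang-style matcher.
// $patterns maps a letter string to its break priorities; a pattern of
// n characters carries n + 1 priorities (one per gap, including both ends).
function sketchHyphenationPoints(string $word, array $patterns): array
{
    // Liang's method marks the word boundaries with dots.
    $padded = '.' . strtolower($word) . '.';
    $len = strlen($padded); // assumption: ASCII-only input

    // $levels[$i] is the priority of a break just before $padded[$i].
    $levels = array_fill(0, $len + 1, 0);

    // Try every substring of the padded word as a pattern key.
    for ($start = 0; $start < $len; $start++) {
        for ($size = 1; $start + $size <= $len; $size++) {
            $key = substr($padded, $start, $size);
            if (!isset($patterns[$key])) {
                continue;
            }
            // Keep the highest priority seen for each gap.
            foreach ($patterns[$key] as $k => $level) {
                $levels[$start + $k] = max($levels[$start + $k], $level);
            }
        }
    }

    // The first subword always starts at offset 0.
    $points = [0];
    $wordLen = strlen($word);
    for ($offset = 1; $offset < $wordLen; $offset++) {
        // Offset $offset in $word is index $offset + 1 in $padded;
        // an odd priority marks a permitted break.
        if ($levels[$offset + 1] % 2 === 1) {
            $points[] = $offset;
        }
    }
    return $points;
}

// Toy patterns (illustrative only, already split into letters and priorities).
$toyPatterns = [
    'hyph' => [0, 0, 3, 0, 0], // written "hy3ph" in TeX-style notation
    'hen'  => [0, 0, 2, 0],    // written "he2n"
];
print_r(sketchHyphenationPoints('hyphen', $toyPatterns));
```

With these toy patterns the sketch yields the offsets 0 and 2, corresponding to the subwords "hy" and "phen".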
| |
| ===LoadHyphenDictionary()===
| |
| Load the hyphenation dictionary.
| |
| | |
| The file is expected to be a 'mashed up' version of a .tex
| |
| hyphenation dictionary generated by using substrings.pl
| |
| as in the stand-alone hyphenation code of
| |
| http://lingucomponent.openoffice.org/hyphenator.html
| |
| *Signature: public function LoadHyphenDictionary($file)
| |
| Parameters:
| |
| * [http://www.php.net/manual/en/language.types.string.php string ] $file<br/>file containing the dictionary
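
As background on what such a dictionary contains, the sketch below splits a single TeX-style pattern such as <code>hy3ph</code> into its letters and break priorities. It illustrates the pattern notation only and does not reproduce the exact layout of the 'mashed up' file or the parsing done by LoadHyphenDictionary(); the function name <code>sketchParsePattern</code> is hypothetical.

```php
<?php
// Hypothetical helper, not part of I2CE_Hyphen.
// Splits a TeX-style pattern into [letters, priorities], where priorities
// has one entry per gap (before, between, and after the characters).
function sketchParsePattern(string $pattern): array
{
    $letters = '';
    $levels = [0]; // priority of the gap before the first character

    // Split into UTF-8 characters so multi-byte letters stay whole.
    foreach (preg_split('//u', $pattern, -1, PREG_SPLIT_NO_EMPTY) as $char) {
        if (ctype_digit($char)) {
            // A digit sets the priority of the current gap.
            $levels[count($levels) - 1] = (int) $char;
        } else {
            // Any other character is part of the key and opens a new gap.
            $letters .= $char;
            $levels[] = 0;
        }
    }
    return [$letters, $levels];
}

print_r(sketchParsePattern('hy3ph'));
```

Gaps without a digit default to priority 0, so a pattern of n characters always produces n + 1 priorities.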
| |
| ===Visualize()===
| |
| Visualize a hyphenation for a word
| |
| WARNING: the word is assumed to be a single word with no whitespace or periods, and
| |
| no digits or other special characters (unless they are already in your hyphenation dictionary).
| |
| *Signature: public function Visualize($word,$supress)
| |
| *Returns: [http://www.php.net/manual/en/language.types.string.php string ] the hyphenated word
| |
| Parameters:
| |
| * [http://www.php.net/manual/en/language.types.string.php string ] $word<br/>the word that is to be hyphenated
| |
| * [http://www.php.net/manual/en/language.types.boolean.php bool ] $supress<br/>true (default) to suppress hyphenation points at the beginning/end of a word.
| |
| **Default Value: TRUE
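
To show how a list of subword offsets relates to a visualized word, here is a minimal sketch that joins the subwords with a separator. The function name <code>sketchVisualize</code> and the choice of <code>-</code> as separator are assumptions; the actual output format of Visualize() is not specified on this page.

```php
<?php
// Hypothetical helper, not part of I2CE_Hyphen.
// $points holds the starting offset (in characters) of each subword,
// in ascending order and beginning with 0.
function sketchVisualize(string $word, array $points, string $separator = '-'): string
{
    $pieces = [];
    $count = count($points);
    for ($i = 0; $i < $count; $i++) {
        $start = $points[$i];
        // A subword ends where the next one begins, or at the end of the word.
        $end = ($i + 1 < $count) ? $points[$i + 1] : mb_strlen($word, 'UTF-8');
        $pieces[] = mb_substr($word, $start, $end - $start, 'UTF-8');
    }
    return implode($separator, $pieces);
}

echo sketchVisualize('hyphen', [0, 2]), "\n";
```

The sketch uses the mbstring functions so that offsets count characters rather than bytes, in line with the UTF-8 default noted above.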
| |
| ===__construct()===
| |
| to the specified encoding.
| |
| *Signature: public function __construct($enc)
| |
| Parameters:
| |
| * [[Class: I2CE_Encoding | I2CE_Encoding]] $enc<br/>the encoding used for the internal storage of this hyphenation dictionary
| |
| ===getWordParts()===
| |
| Get the parts of a word, broken at hyphenation points and at any non-letter.
| |
| *Signature: public function getWordParts($word,$supress)
| |
| *Returns: an array whose entries are associative arrays; each has
| |
| a string 'Subword' which tells what the subword is, the int 'Offset' which tells where the subword started,
| |
| the int 'Length' which gives the length of the subword, and the boolean 'IsLetter' which tells us whether the
| |
| subword is composed of letters (by the Unicode convention) or not.
| |
| Parameters:
| |
| * [http://www.php.net/manual/en/language.types.string.php string ] $word<br/>the word we wish to break up
| |
| * [http://www.php.net/manual/en/language.types.boolean.php bool ] $supress<br/>true (default) to suppress hyphenation points at the beginning/end of a word.
| |
| **Default Value: true
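
The shape of the returned entries can be illustrated with a minimal sketch that splits a string into runs of letters and non-letters using Unicode-aware preg. It covers only the letter/non-letter split, not the additional breaks at hyphenation points, and it is not the I2CE_Hyphen implementation: the function name <code>sketchLetterRuns</code> is hypothetical, and counting 'Offset' and 'Length' in characters rather than bytes is an assumption.

```php
<?php
// Hypothetical helper, not part of I2CE_Hyphen.
function sketchLetterRuns(string $text): array
{
    $parts = [];
    // \p{L}+ matches a run of Unicode letters; [^\p{L}]+ matches anything else.
    preg_match_all('/\p{L}+|[^\p{L}]+/u', $text, $matches);

    $offset = 0; // running offset, counted in characters (assumption)
    foreach ($matches[0] as $run) {
        $length = mb_strlen($run, 'UTF-8');
        $parts[] = [
            'Subword'  => $run,
            'Offset'   => $offset,
            'Length'   => $length,
            // A run is either all letters or all non-letters,
            // so checking its first character is enough.
            'IsLetter' => preg_match('/^\p{L}/u', $run) === 1,
        ];
        $offset += $length;
    }
    return $parts;
}

print_r(sketchLetterRuns('well-known'));
```

For the input <code>well-known</code> the sketch produces three entries: the letter run "well", the non-letter run "-", and the letter run "known".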
| |
| | |
| | |
| [[Category:Class Documentation]]
| |