Class: I2CE Hyphen
Revision as of 21:27, 16 October 2009
This article describes the class I2CE_Hyphen. It is contained in the module textlayout in the package TextLayout Tools.
The class is defined in the file: lib/I2CE_Hyphen.php
A PHP implementation of Knuth and Liang's hyphenation algorithm. In particular, it uses the 'mashed up' dictionary files. The algorithm is described at http://lingucomponent.openoffice.org/hyphenator.html
Note: Internally, by default, all strings are encoded as UTF-8. This is highly recommended so that the Unicode preg functions work quickly (without having to convert to UTF-8 and then back).
Note: Does not (yet) support the non-standard hyphenation of Hungarian, Swedish, etc.
- Subpackage: TextLayout
- Author: Carl Leitner <litlfred@ibiblio.org>
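The pattern-matching idea behind Liang's algorithm can be sketched in a few lines of PHP. This is only an illustration under stated assumptions: the function name liangPoints and the short hand-written pattern list are hypothetical, the word is assumed to be ASCII, and the sketch reads plain TeX-style patterns, not the 'mashed up' dictionary files that I2CE_Hyphen loads. It is not the class's own code.

```php
<?php
// Minimal sketch of Liang-style pattern hyphenation (hypothetical helper,
// ASCII words only, plain patterns rather than a 'mashed up' dictionary).
function liangPoints(string $word, array $patterns): array
{
    $text = '.' . $word . '.';              // dots mark the word boundaries
    $n = strlen($text);
    $levels = array_fill(0, $n + 1, 0);     // $levels[$k]: gap before $text[$k]

    foreach ($patterns as $pattern) {
        // Separate the letters of the pattern from its digits.
        $letters = preg_replace('/\d/', '', $pattern);
        if ($letters === '') {
            continue;                       // nothing to match against
        }
        $digits = [];                       // gap index within pattern => level
        $gap = 0;
        foreach (str_split($pattern) as $ch) {
            if (ctype_digit($ch)) {
                $digits[$gap] = (int) $ch;
            } else {
                $gap++;
            }
        }
        // Apply the pattern wherever it occurs, keeping the maximum level.
        $pos = 0;
        while (($pos = strpos($text, $letters, $pos)) !== false) {
            foreach ($digits as $g => $level) {
                $levels[$pos + $g] = max($levels[$pos + $g], $level);
            }
            $pos++;
        }
    }

    // Odd levels allow a break. Offsets are relative to $word, and 0 is
    // always included as the start of the first subword.
    $points = [0];
    for ($k = 2; $k < $n - 1; $k++) {
        if ($levels[$k] % 2 === 1) {
            $points[] = $k - 1;
        }
    }
    return $points;
}

// Illustrative patterns in the TeX style, written out by hand.
$patterns = ['hy3ph', 'he2n', 'hena4', 'hen5at', '1na', 'n2at', '1tio', '2io', 'o2n'];
$points = liangPoints('hyphenation', $patterns);
echo implode(',', $points), "\n";
```

Each digit in a pattern is a level for one gap between letters; where patterns overlap the highest level wins, and only odd levels permit a break.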
Variables
$enc
The encoding (an I2CE_Encoding) used for internal storage of strings
- Type: protected $enc
$patterns
An associative array containing the hyphenation patterns
- Type: protected $patterns
$trans
- Type: protected $trans
Methods
HyphenateWord()
Hyphenates a word according to the loaded dictionary. WARNING: the word is assumed to consist only of letters. If you need something more general, see getWordParts().
- Signature: public function HyphenateWord($word,$supress)
- Returns: array of int containing the hyphenation points. The hyphenation points are the offsets of the beginning of each subword, so 0 is always a hyphenation point.
Parameters:
- string $word
the word to be hyphenated
- bool $supress
true (default) to suppress hyphenation points at the beginning/end of a word.
- Default Value: true
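To make the return value concrete, the sketch below cuts a word at a list of start offsets. The helper splitAtOffsets is hypothetical and not part of I2CE_Hyphen, the offsets are written by hand for illustration, and it assumes the offsets count characters rather than bytes, which the documentation above does not state.

```php
<?php
// Hypothetical helper: cut a word into subwords at the given start offsets.
// Assumes the offsets are character offsets into a UTF-8 string.
function splitAtOffsets(string $word, array $points): array
{
    sort($points);
    // Split into UTF-8 characters so multibyte letters count as one unit.
    $chars = preg_split('//u', $word, -1, PREG_SPLIT_NO_EMPTY);
    $length = count($chars);
    $subwords = [];
    foreach ($points as $i => $start) {
        // Each subword runs up to the next hyphenation point, or to the end.
        $end = $points[$i + 1] ?? $length;
        $subwords[] = implode('', array_slice($chars, $start, $end - $start));
    }
    return $subwords;
}

// Illustrative offsets, not produced by a real dictionary.
$subwords = splitAtOffsets('hyphenation', [0, 2, 6]);
echo implode('-', $subwords), "\n";
```

Because 0 is always present, the first subword starts at the beginning of the word and no separate case is needed for it.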
LoadHyphenDictionary()
Load the hyphenation dictionary.
The file is expected to be a 'mashed up' version of a .tex hyphenation dictionary generated by using substrings.pl, as in the stand-alone hyphenation code of http://lingucomponent.openoffice.org/hyphenator.html
- Signature: public function LoadHyphenDictionary($file)
Parameters:
- string $file
file containing the dictionary
Visualize()
Visualize a hyphenation for a word. WARNING: the word is assumed to be a single word with no whitespace or periods, and no digits or other special characters (unless they are already in your hyphenation dictionary).
- Signature: public function Visualize($word,$supress)
- Returns: string the hyphenated word
Parameters:
- string $word
the word that is to be hyphenated
- bool $supress
true (default) to suppress hyphenation points at the beginning/end of a word.
- Default Value: TRUE
__construct()
to the specified encoding.
- Signature: public function __construct($enc)
Parameters:
- I2CE_Encoding $enc
specifies the encoding for the internal storage of this hyphenation dictionary
getWordParts()
Get the parts of a word which breaks along hyphenation points or any non-letter.
- Signature: public function getWordParts($word,$supress)
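- Returns: an array with one associative array per subword. Each associative array has the string 'Subword', which tells what the subword is; the int 'Offset', which tells where the subword started; the int 'Length', the length of the subword; and the boolean 'IsLetter', which tells whether the subword is composed of letters (by the Unicode convention) or not.
Parameters:
- string $word
the word we wish to break up
- bool $supress
true (default) to suppress hyphenation points at the beginning/end of a word.
- Default Value: true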
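The shape of each entry can be illustrated with a small stand-in that only performs the letter/non-letter split. The function splitLetterRuns is hypothetical: it does not consult a hyphenation dictionary, so letter runs are not broken further, and it assumes 'Offset' and 'Length' count characters rather than bytes.

```php
<?php
// Hypothetical stand-in for the non-letter splitting only. It builds entries
// with the four keys described above but performs no hyphenation.
function splitLetterRuns(string $text): array
{
    // \p{L}+ matches a run of Unicode letters, \P{L}+ a run of anything else.
    preg_match_all('/\p{L}+|\P{L}+/u', $text, $matches);
    $parts = [];
    $offset = 0;
    foreach ($matches[0] as $run) {
        // Count UTF-8 characters, not bytes.
        $length = count(preg_split('//u', $run, -1, PREG_SPLIT_NO_EMPTY));
        $parts[] = [
            'Subword'  => $run,
            'Offset'   => $offset,
            'Length'   => $length,
            'IsLetter' => preg_match('/^\p{L}/u', $run) === 1,
        ];
        $offset += $length;
    }
    return $parts;
}

$parts = splitLetterRuns('über-cool');
print_r($parts);
```

In the real method, each letter run would presumably be divided again at its hyphenation points, giving more entries with 'IsLetter' set to true.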