Parser Class Reference

PHP Parser - Processes wiki markup (which uses a more user-friendly syntax, such as "[[link]]" for making links), and provides a one-way transformation of that wiki markup into XHTML output / markup (which in turn the browser understands, and can display).

Inherited by Parser_LinkHooks.


Public Member Functions

 __construct ($conf=array())
 __destruct ()
 Reduce memory usage to reduce the impact of circular references.
 firstCallInit ()
 Do various kinds of initialisation on the first call of the parser.
 setOutputType ($ot)
 setTitle ($t)
 Set the context title.
 uniqPrefix ()
 Accessor for mUniqPrefix.
 parse ($text, Title $title, ParserOptions $options, $linestart=true, $clearState=true, $revid=null)
 Convert wikitext to HTML. Do not call this function recursively.
 recursiveTagParse ($text)
 Recursive parser entry point that can be called from an extension tag hook.
 preprocess ($text, $title, $options, $revid=null)
 Expand templates and variables in the text, producing valid, static wikitext.
 getTitle ()
 getOptions ()
 getRevisionId ()
 getOutput ()
 nextLinkID ()
 getFunctionLang ()
 getPreprocessor ()
 Get a preprocessor object.
 getStripList ()
 Get a list of strippable XML-like elements.
 strip ($text, $state, $stripcomments=false, $dontstrip=array())
 unstripForHTML ($text)
 magicLinkCallback ($m)
 doQuotes ($text)
 Helper function for doAllQuotes().
 getExternalLinkAttribs ()
 makeLinkHolder (&$nt, $text= '', $query= '', $trail= '', $prefix= '')
 Make a link placeholder.
 makeKnownLinkHolder ($nt, $text= '', $query= '', $trail= '', $prefix= '')
 Render a forced-blue link inline; protect against double expansion of URLs if we're in a mode that prepends full URL prefixes to internal links.
 armorLinks ($text)
 Insert a NOPARSE hacky thing into any inline links in a chunk that's going to go through further parsing steps before inline URL expansion.
 areSubpagesAllowed ()
 Return true if subpage links should be expanded on this page.
 getCommon ($st1, $st2)
 openList ($char)
 nextItem ($char)
 closeList ($char)
 findColonNoLinks ($str, &$before, &$after)
 Split up a string on ':', ignoring any occurrences inside tags to prevent illegal overlapping.
 limitationWarn ($limitationType, $current=null, $max=null)
 Warn the user when a parser limitation is reached. Will warn the user at most once per limitation type.
 getTemplateDom ($title)
 Get the semi-parsed DOM representation of a template with a given title, and its redirect destination title.
 fetchTemplateAndTitle ($title)
 Fetch the unparsed text of a template and register a reference to it.
 fetchTemplate ($title)
 interwikiTransclude ($title, $action)
 Transclude an interwiki link.
 fetchScaryTemplateMaybeFromCache ($url)
 extensionSubstitution ($params, $frame)
 Return the text to be used for a given extension tag.
 incrementIncludeSize ($type, $size)
 Increment an include size counter.
 incrementExpensiveFunctionCount ()
 Increment the expensive function count.
 doDoubleUnderscore ($text)
 Strip double-underscore items like __NOGALLERY__ and __NOTOC__. Fills $this->mDoubleUnderscores and returns the modified text.
 preSaveTransform ($text, &$title, $user, $options, $clearState=true)
 Transform wiki markup when saving a page by doing \r\n -> \n conversion, substituting signatures, {{subst:}} templates, etc.
 validateSig ($text)
 Check that the user's signature contains no bad XML.
 cleanSig ($text, $parsing=false)
 Clean up signature text.
 cleanSigInSig ($text)
 Strip ~~~, ~~~~ and ~~~~~ out of signatures.
 startExternalParse (&$title, $options, $outputType, $clearState=true)
 Set up some variables which are usually set up in parse() so that an external function can call some class members with confidence.
 transformMsg ($text, $options)
 Wrapper for preprocess().
 setHook ($tag, $callback)
 Create an HTML-style tag, e.g.
 setTransparentTagHook ($tag, $callback)
 clearTagHooks ()
 Remove all tag hooks.
 setFunctionHook ($id, $callback, $flags=0)
 Create a function, e.g.
 getFunctionHooks ()
 Get all registered function hook identifiers.
 replaceLinkHolders (&$text, $options=0)
 Replace link placeholders with actual links, in the buffer. Placeholders are created in Skin::makeLinkObj(). Returns an array of link CSS classes, indexed by PDBK.
 replaceLinkHoldersText ($text)
 Replace link placeholders with plain text of links (not HTML-formatted).
 renderPreTag ($text, $attribs)
 Tag hook handler for 'pre'.
 renderImageGallery ($text, $params)
 Renders an image gallery from a text with one line per image.
 getImageParams ($handler)
 makeImage ($title, $options, $holders=false)
 Parse image options text and use it to make an image.
 disableCache ()
 Set a flag in the output object indicating that the content is dynamic and shouldn't be cached.
 Title ($x=NULL)
 Options ($x=NULL)
 OutputType ($x=NULL)
 getTags ()
 getSection ($text, $section, $deftext='')
 This function returns the text of a section, specified by a number ($section).
 replaceSection ($oldtext, $section, $text)
 getRevisionTimestamp ()
 Get the timestamp associated with the current revision, adjusted for the default server-local timestamp.
 setDefaultSort ($sort)
 Mutator for $mDefaultSort.
 getDefaultSort ()
 Accessor for $mDefaultSort. Will use the title/prefixed title if none is set.
 getCustomDefaultSort ()
 Accessor for $mDefaultSort. Unlike getDefaultSort(), will return false if none is set.
 guessSectionNameFromWikiText ($text)
 Try to guess the section anchor name based on a wikitext fragment presumably extracted from a heading, for example "Header" from "== Header ==".
 stripSectionName ($text)
 Strips a text string of wikitext for use in a section anchor.
 srvus ($text)
 testSrvus ($text, $title, $options, $outputType=self::OT_HTML)
 strip/replaceVariables/unstrip for preprocessor regression testing
 testPst ($text, $title, $options)
 testPreprocess ($text, $title, $options)
 markerSkipCallback ($s, $callback)

Static Public Member Functions

 extractTagsAndParams ($elements, $text, &$matches, $uniq_prefix= '')
 Replaces all occurrences of HTML-style comments and the given tags in the text with a random marker and returns the new text.
 tidy ($text)
 Interface with html tidy, used if $wgUseTidy = true.
static replaceUnusualEscapes ($url)
 Replace unusual URL escape codes with their equivalent characters.
static splitWhitespace ($s)
static createAssocArgs ($args)
 Clean up argument array - refactored in 1.9 so parserfunctions can use it, too.
static statelessFetchTemplate ($title, $parser=false)
 Static function to get a template. Can be overridden via ParserOptions::setTemplateCallback().

Public Attributes

const VERSION = '1.6.4'
 Update this version number when the ParserOutput format changes in an incompatible way, so the parser cache can automatically discard old data.
const SFH_NO_HASH = 1
const SFH_OBJECT_ARGS = 2
const EXT_LINK_URL_CLASS = '[^][<>"\\x00-\\x20\\x7F]'
const EXT_IMAGE_REGEX
const COLON_STATE_TEXT = 0
const COLON_STATE_TAG = 1
const COLON_STATE_TAGSTART = 2
const COLON_STATE_CLOSETAG = 3
const COLON_STATE_TAGSLASH = 4
const COLON_STATE_COMMENT = 5
const COLON_STATE_COMMENTDASH = 6
const COLON_STATE_COMMENTDASHDASH = 7
const PTD_FOR_INCLUSION = 1
const OT_HTML = 1
const OT_WIKI = 2
const OT_PREPROCESS = 3
const OT_MSG = 3
const MARKER_SUFFIX = "-QINU\x7f"
 $mOutput
 $mAutonumber
 $mDTopen
 $mStripState
 $mIncludeCount
 $mArgStack
 $mLastSection
 $mInPre
 $mLinkHolders
 $mLinkID
 $mIncludeSizes
 $mPPNodeCount
 $mDefaultSort
 $mTplExpandCache
 $mTplRedirCache
 $mTplDomCache
 $mHeadings
 $mDoubleUnderscores
 $mExpensiveFunctionCount
 $mFileCache
 $mOptions
 $mTitle
 $mOutputType
 $ot
 $mRevisionId
 $mRevisionTimestamp
 $mRevIdForTs

Protected Member Functions

 stripAltText ($caption, $holders)

Private Member Functions

 clearState ()
 Clear Parser state.
 unstrip ($text, $state)
 Restores pre, math, and other extensions removed by strip().
 unstripNoWiki ($text, $state)
 Always call this after unstrip() to preserve the order.
 insertStripItem ($text)
 Add an item to the strip state. Returns the unique tag which must be inserted into the stripped text. The tag will be replaced with the original text in unstrip().
 doTableStuff ($text)
 Parse the wiki syntax used to render tables.
 internalParse ($text)
 Helper function for parse() that transforms wiki markup into HTML.
 doMagicLinks ($text)
 Replace special strings like "ISBN xxx" and "RFC xxx" with magic external links.
 makeFreeExternalLink ($url)
 Make a free external link, given a user-supplied URL.
 doHeadings ($text)
 Parse headers and return HTML.
 doAllQuotes ($text)
 Replace single quotes with HTML markup.
 replaceExternalLinks ($text)
 Replace external links (REL).
 maybeMakeExternalImage ($url)
 Make an image if it's allowed, either through the global option, through the exception, or through the on-wiki whitelist.
 replaceInternalLinks ($s)
 Process [[ ]] wikilinks.
 replaceInternalLinks2 (&$s)
 Process [[ ]] wikilinks (RIL).
 maybeDoSubpageLink ($target, &$text)
 Handle link to subpage if necessary.
 closeParagraph ()
 Used by doBlockLevels().
 doBlockLevels ($text, $linestart)
 getVariableValue ($index)
 Return value of a magic variable (like PAGENAME).
 initialiseVariables ()
 Initialise the magic variables (like CURRENTMONTHNAME).
 preprocessToDom ($text, $flags=0)
 Preprocess some wikitext and return the document tree.
 replaceVariables ($text, $frame=false, $argsOnly=false)
 Replace magic variables, templates, and template arguments with the appropriate text.
 braceSubstitution ($piece, $frame)
 Return the text of a template, after recursively replacing any variables or templates within the template.
 argSubstitution ($piece, $frame)
 Triple brace replacement -- used for template arguments.
 formatHeadings ($text, $isMain=true)
 This function accomplishes several tasks: 1) Auto-number headings if that option is enabled 2) Add an [edit] link to sections for users who have enabled the option and can edit the page 3) Add a Table of contents on the top for users who have enabled the option 4) Auto-anchor headings.
 pstPass2 ($text, $user)
 Pre-save transform helper function.
 getUserSig (&$user)
 Fetch the user's signature text, if any, and normalize to validated, ready-to-insert wikitext.
 attributeStripCallback (&$text, $frame=false)
 Callback from the Sanitizer for expanding items found in HTML attribute values, so they can be safely tested and escaped.
 extractSections ($text, $section, $mode, $newText='')

Static Private Member Functions

 getRandomString ()
 Get a random string.
 externalTidy ($text)
 Spawn an external HTML tidy process and get corrected markup back from it.
 internalTidy ($text)
 Use the HTML tidy PECL extension to use the tidy library in-process, saving the overhead of spawning a new process.
static replaceUnusualEscapesCallback ($matches)
 Callback function used in replaceUnusualEscapes().

Private Attributes

 $mTagHooks
 $mTransparentTagHooks
 $mFunctionHooks
 $mFunctionSynonyms
 $mVariables
 $mImageParams
 $mImageParamsMagicArray
 $mStripList
 $mMarkerIndex
 $mPreprocessor
 $mExtLinkBracketedRegex
 $mUrlProtocols
 $mDefaultStripList
 $mVarCache
 $mConf


Detailed Description

PHP Parser - Processes wiki markup (which uses a more user-friendly syntax, such as "[[link]]" for making links), and provides a one-way transformation of that wiki markup into XHTML output / markup (which in turn the browser understands, and can display).

 There are five main entry points into the Parser class:
 parse()
   Produces HTML output.
 preSaveTransform()
   Produces altered wiki markup.
 preprocess()
   Removes HTML comments and expands templates.
 cleanSig()
   Cleans a signature before saving it to preferences.
 extractSections()
   Extracts sections from an article for section editing.

 Globals used:
    objects:   $wgLang, $wgContLang

 NOT $wgArticle, $wgUser or $wgTitle. Keep them away!

 settings:
  $wgUseTex*, $wgUseDynamicDates*, $wgInterwikiMagic*,
  $wgNamespacesWithSubpages, $wgAllowExternalImages*,
  $wgLocaltimezone, $wgAllowSpecialInclusion*,
  $wgMaxArticleSize*

  * only within ParserOptions
 

Definition at line 46 of file Parser.php.


Constructor & Destructor Documentation

Parser::__construct ( conf = array()  ) 

Constructor

Reimplemented in Parser_LinkHooks.

Definition at line 125 of file Parser.php.

References $conf, wfDebug(), and wfUrlProtocols().

Parser::__destruct (  ) 

Reduce memory usage to reduce the impact of circular references.

Definition at line 154 of file Parser.php.

References $name.


Member Function Documentation

Parser::firstCallInit (  ) 

Do various kinds of initialisation on the first call of the parser.

Reimplemented in Parser_LinkHooks.

Definition at line 166 of file Parser.php.

References initialiseVariables(), CoreParserFunctions::register(), setHook(), wfProfileIn(), wfProfileOut(), and wfRunHooks().

Referenced by clearState().

Parser::clearState (  )  [private]

Parser::setOutputType ( ot  ) 

Definition at line 244 of file Parser.php.

References $ot.

Referenced by cleanSig(), extractSections(), parse(), preSaveTransform(), startExternalParse(), and testSrvus().

Parser::setTitle ( t  ) 

Set the context title.

Definition at line 257 of file Parser.php.

References $t, and Title::newFromText().

Referenced by cleanSig(), extractSections(), parse(), preSaveTransform(), and startExternalParse().

Parser::uniqPrefix (  ) 

Accessor for mUniqPrefix.

Definition at line 275 of file Parser.php.

Parser::parse ( text,
Title title,
ParserOptions options,
linestart = true,
clearState = true,
revid = null 
)

Convert wikitext to HTML. Do not call this function recursively.

Parameters:
$text String: text we want to parse
$title A title object
$options ParserOptions
$linestart boolean
$clearState boolean
$revid Int: number to pass in {{REVISIONID}}
Returns:
ParserOutput a ParserOutput

Definition at line 300 of file Parser.php.

References $data, $fname, $matches, $output, $text, $wgContLang, clearState(), doBlockLevels(), http, internalParse(), is(), Sanitizer::normalizeCharReferences(), replaceLinkHolders(), setOutputType(), setTitle(), wfGetCaller(), wfProfileIn(), and wfRunHooks().
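
A minimal sketch of a top-level call to parse() follows. It assumes the caller has already built the Parser, Title and ParserOptions objects, and that the returned ParserOutput exposes its HTML through getText(). The helper name renderWikitextSketch() is hypothetical and is not part of the Parser class.

```php
<?php
// Sketch under assumptions: $parser, $title and $options are ready-made
// Parser, Title and ParserOptions objects supplied by the caller.
// renderWikitextSketch() is a hypothetical helper, not a Parser method.
function renderWikitextSketch($parser, $title, $options, $wikitext) {
    // Top-level call only: parse() must not be re-entered recursively.
    // The last three arguments are the documented defaults
    // ($linestart = true, $clearState = true, $revid = null).
    $output = $parser->parse($wikitext, $title, $options, true, true, null);

    // Assumption: the ParserOutput returns its rendered HTML via getText().
    return $output->getText();
}
```

Code running inside an extension tag hook would use recursiveTagParse() instead, since parse() is documented as non-reentrant.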

Parser::recursiveTagParse ( text  ) 

Recursive parser entry point that can be called from an extension tag hook.

Definition at line 440 of file Parser.php.

Referenced by renderImageGallery().

Parser::preprocess ( text,
title,
options,
revid = null 
)

Expand templates and variables in the text, producing valid, static wikitext.

Also removes comments.

Definition at line 453 of file Parser.php.

Referenced by transformMsg().

Parser::getRandomString (  )  [static, private]

Get a random string.

Definition at line 476 of file Parser.php.

Referenced by Parser_DiffTest::__construct().

& Parser::getTitle (  ) 

Definition at line 480 of file Parser.php.

Parser::getOptions (  ) 

Definition at line 481 of file Parser.php.

Parser::getRevisionId (  ) 

Definition at line 482 of file Parser.php.

Parser::getOutput (  ) 

Definition at line 483 of file Parser.php.

Parser::nextLinkID (  ) 

Definition at line 484 of file Parser.php.

Parser::getFunctionLang (  ) 

Definition at line 486 of file Parser.php.

Referenced by replaceExternalLinks().

Parser::getPreprocessor (  ) 

Get a preprocessor object.

Definition at line 500 of file Parser.php.

Referenced by cleanSig(), extractSections(), preprocessToDom(), and replaceVariables().

Parser::extractTagsAndParams ( elements,
text,
&$  matches,
uniq_prefix = '' 
) [static]

Replaces all occurrences of HTML-style comments and the given tags in the text with a random marker and returns the new text.

The output parameter $matches will be an associative array filled with data in the form: 'UNIQ-xxxxx' => array( 'element', 'tag content', array( 'param' => 'x' ), '<element param="x">tag content</element>' ) )

Parameters:
$elements list of element names. Comments are always extracted.
$text Source text string.
$uniq_prefix 

Definition at line 526 of file Parser.php.
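
The marker technique described above can be illustrated with a small standalone sketch. This is not the code in Parser.php: the function name sketchExtractTags(), the marker format, and the handling of attributes (kept as a raw string rather than a parsed array) are all assumptions made for illustration, and HTML comments are not handled.

```php
<?php
// Illustrative sketch only; not the MediaWiki implementation.
// sketchExtractTags() and its marker format are hypothetical.
function sketchExtractTags(array $elements, $text, &$matches, $uniqPrefix = 'UNIQ') {
    $matches = array();
    $n = 0;
    foreach ($elements as $element) {
        $name = preg_quote($element, '/');
        // Match <element ...>content</element>, non-greedy, case-insensitive.
        $pattern = "/<{$name}(\\s[^>]*)?>(.*?)<\\/{$name}>/is";
        $text = preg_replace_callback(
            $pattern,
            function ($m) use (&$matches, &$n, $element, $uniqPrefix) {
                // Build a marker that is unlikely to occur in normal text.
                $marker = sprintf('%s-%s-%08X-QINU', $uniqPrefix, $element, $n++);
                // element name, tag content, raw attribute string, full original tag
                $matches[$marker] = array($element, $m[2], trim($m[1]), $m[0]);
                return $marker;
            },
            $text
        );
    }
    return $text;
}
```

Because each marker maps back to the full original tag, the stripped text can be restored with strtr() once the intermediate processing is done, for example by building an array of marker => original tag from $matches.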

Parser::getStripList (  ) 

Get a list of strippable XML-like elements.

Definition at line 591 of file Parser.php.

Parser::strip ( text,
state,
stripcomments = false,
dontstrip = array () 
)

Deprecated:
use replaceVariables

Definition at line 606 of file Parser.php.

Parser::unstrip ( text,
state 
) [private]

Restores pre, math, and other extensions removed by strip().

Always call unstripNoWiki() after this one.

Deprecated:
use $this->mStripState->unstrip()

Definition at line 617 of file Parser.php.

Parser::unstripNoWiki ( text,
state 
) [private]

Always call this after unstrip() to preserve the order.

Deprecated:
use $this->mStripState->unstrip()

Definition at line 627 of file Parser.php.

Parser::unstripForHTML ( text  ) 

Deprecated:
use $this->mStripState->unstripBoth()

Definition at line 634 of file Parser.php.

Parser::insertStripItem ( text  )  [private]

Add an item to the strip state. Returns the unique tag which must be inserted into the stripped text. The tag will be replaced with the original text in unstrip().

Definition at line 645 of file Parser.php.

Referenced by braceSubstitution().

Parser::tidy ( text  )  [static]

Interface with html tidy, used if $wgUseTidy = true.

If tidy isn't able to correct the markup, the original will be returned in all its glory with a warning comment appended.

Either the external tidy program or the in-process tidy extension will be used depending on availability. Override the default $wgTidyInternal setting to disable the internal if it's not working.

Parameters:
string $text Hideous HTML input
Returns:
string Corrected HTML output

Definition at line 666 of file Parser.php.

Referenced by internalTidy(), and ParserTest::tidy().

Parser::externalTidy ( text  )  [static, private]

Spawn an external HTML tidy process and get corrected markup back from it.

Definition at line 698 of file Parser.php.

References $text, wfGetNull(), wfProfileIn(), and wfProfileOut().

Parser::internalTidy ( text  )  [static, private]

Use the HTML tidy PECL extension to use the tidy library in-process, saving the overhead of spawning a new process.

'pear install tidy' should be able to compile the extension module.

Definition at line 749 of file Parser.php.

References $IP, $text, tidy(), wfProfileIn(), and wfProfileOut().

Parser::doTableStuff ( text  )  [private]

Parse the wiki syntax used to render tables.

Definition at line 778 of file Parser.php.

References $matches, $out, $text, count(), StringUtils::explode(), StringUtils::explodeMarkup(), Sanitizer::fixTagAttributes(), wfProfileIn(), and wfProfileOut().

Referenced by internalParse().

Parser::internalParse ( text  )  [private]

Parser::doMagicLinks ( text  )  [private]

Replace special strings like "ISBN xxx" and "RFC xxx" with magic external links.

DML

Definition at line 1027 of file Parser.php.

References $text, wfProfileIn(), and wfProfileOut().

Referenced by internalParse().

Parser::magicLinkCallback ( m  ) 

Definition at line 1047 of file Parser.php.

References $m, $num, $url, SpecialPage::getTitleFor(), makeFreeExternalLink(), and wfMsg().

Parser::makeFreeExternalLink ( url  )  [private]

Make a free external link, given a user-supplied URL.

Returns:
HTML

Definition at line 1097 of file Parser.php.

References $sep, $text, $url, Sanitizer::cleanUrl(), getExternalLinkAttribs(), maybeMakeExternalImage(), wfProfileIn(), and wfProfileOut().

Referenced by magicLinkCallback().

Parser::doHeadings ( text  )  [private]

Parse headers and return HTML.

Definition at line 1149 of file Parser.php.

References $i, $text, wfProfileIn(), and wfProfileOut().

Referenced by internalParse().
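
As a rough illustration of heading conversion, the sketch below turns "== Header ==" style lines into <h1> to <h6> elements. It is a simplified standalone example under stated assumptions, not the body of doHeadings(): the name sketchDoHeadings() is hypothetical, and the surrounding whitespace is kept inside the heading element.

```php
<?php
// Illustrative sketch only; sketchDoHeadings() is a hypothetical name.
function sketchDoHeadings($text) {
    // Work from six '=' down to one so that longer runs are consumed first
    // and "=== x ===" is not mistaken for a level-1 heading.
    for ($i = 6; $i >= 1; --$i) {
        $h = str_repeat('=', $i);
        $text = preg_replace(
            "/^{$h}(.+){$h}\\s*$/m",   // whole line wrapped in $i equals signs
            "<h{$i}>\\1</h{$i}>",
            $text
        );
    }
    return $text;
}
```

Running the levels in descending order is the key design choice here: a line already rewritten to <hN> no longer starts with '=', so lower levels cannot match it again.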

Parser::doAllQuotes ( text  )  [private]

Replace single quotes with HTML markup.

Returns:
string the altered text

Definition at line 1165 of file Parser.php.

References $text, doQuotes(), StringUtils::explode(), wfProfileIn(), and wfProfileOut().

Referenced by internalParse().
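
The following standalone sketch shows the general idea of quote replacement on a per-line basis, which is what the references to StringUtils::explode() and doQuotes() suggest. It is a deliberate simplification with a hypothetical name, sketchDoAllQuotes(): it only handles balanced '' and ''' pairs and makes no attempt to resolve unbalanced or mixed runs of apostrophes.

```php
<?php
// Illustrative sketch only; handles balanced pairs, nothing more.
// sketchDoAllQuotes() is a hypothetical name.
function sketchDoAllQuotes($text) {
    $out = array();
    // Quote markup does not span lines, so each line is handled on its own.
    foreach (explode("\n", $text) as $line) {
        // Bold first, so that ''' is not consumed as '' plus a stray apostrophe.
        $line = preg_replace("/'''(.+?)'''/", '<b>$1</b>', $line);
        $line = preg_replace("/''(.+?)''/", '<i>$1</i>', $line);
        $out[] = $line;
    }
    return implode("\n", $out);
}
```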

Parser::doQuotes ( text  ) 

Helper function for doAllQuotes().

Definition at line 1180 of file Parser.php.

References $i, $output, $text, and count().

Referenced by doAllQuotes(), and stripSectionName().

Parser::replaceExternalLinks ( text  )  [private]

Replace external links (REL).

Note: this is all very hackish and the order of execution matters a lot. Make sure to run maintenance/parserTests.php if you change this code.

Definition at line 1350 of file Parser.php.

References $i, $img, $text, $url, Sanitizer::cleanUrl(), count(), getExternalLinkAttribs(), getFunctionLang(), maybeMakeExternalImage(), Linker::splitTrail(), wfProfileIn(), wfProfileOut(), and wfUrlProtocols().

Referenced by internalParse(), and replaceInternalLinks2().

Parser::getExternalLinkAttribs (  ) 

Definition at line 1433 of file Parser.php.

References $ns.

Referenced by makeFreeExternalLink(), and replaceExternalLinks().

static Parser::replaceUnusualEscapes ( url  )  [static]

Replace unusual URL escape codes with their equivalent characters.

Parameters:
string 
Returns:
string
Todo:
This can merge genuinely required bits in the path or query string, breaking legit URLs. A proper fix would treat the various parts of the URL differently; as a workaround, just use the output for statistical records, not for actual linking/output.

Definition at line 1457 of file Parser.php.

References $url.
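
A minimal sketch of this kind of escape normalisation is shown below. It is an assumption-laden illustration rather than the Parser.php code: the name sketchReplaceUnusualEscapes() is hypothetical, and the choice to decode only the RFC 3986 "unreserved" characters (letters, digits, '-', '.', '_', '~') is made here for safety and may differ from the character set the real method uses.

```php
<?php
// Illustrative sketch only; sketchReplaceUnusualEscapes() is hypothetical
// and the set of decoded characters is an assumption (RFC 3986 unreserved).
function sketchReplaceUnusualEscapes($url) {
    return preg_replace_callback(
        '/%[0-9A-Fa-f]{2}/',
        function ($m) {
            $char = rawurldecode($m[0]);
            // Decode only characters that never need escaping in a URL;
            // leave every other escape exactly as it was written.
            if (preg_match('/^[A-Za-z0-9._~-]\z/', $char) === 1) {
                return $char;
            }
            return $m[0];
        },
        $url
    );
}
```

Leaving reserved characters such as "/" and "?" escaped avoids the problem noted in the Todo above, where decoding them would change the structure of the path or query string.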

static Parser::replaceUnusualEscapesCallback ( matches  )  [static, private]

Callback function used in replaceUnusualEscapes().

Replaces unusual URL escape codes with their equivalent character.

Definition at line 1468 of file Parser.php.

References $matches.

Parser::maybeMakeExternalImage ( url  )  [private]

Make an image if it's allowed, either through the global option, through the exception, or through the on-wiki whitelist.

Definition at line 1486 of file Parser.php.

References $text, $url, and wfMsgForContent().

Referenced by makeFreeExternalLink(), and replaceExternalLinks().

Parser::replaceInternalLinks ( s  )  [private]

Process [[ ]] wikilinks.

Returns:
processed text

Definition at line 1535 of file Parser.php.

References replaceInternalLinks2().

Referenced by internalParse(), and renderImageGallery().

Parser::replaceInternalLinks2 ( &$  s  )  [private]

Parser::makeLinkHolder ( &$  nt,
text = '',
query = '',
trail = '',
prefix = '' 
)

Make a link placeholder.

The text returned can be later resolved to a real link with replaceLinkHolders(). This is done for two reasons: firstly to avoid further parsing of interwiki links, and secondly to allow all existence checks and article length checks (for stub links) to be bundled into a single query.

Deprecated:

Definition at line 1869 of file Parser.php.

References $prefix, and $text.

Parser::makeKnownLinkHolder ( nt,
text = '',
query = '',
trail = '',
prefix = '' 
)

Render a forced-blue link inline; protect against double expansion of URLs if we're in a mode that prepends full URL prefixes to internal links.

Since this little disaster has to split off the trail text to avoid breaking URLs in the following text without breaking trails on the wiki links, it's been made into a horrible function.

Parameters:
Title $nt
string $text
string $query
string $trail
string $prefix
Returns:
string HTML-wikitext mix oh yuck

Definition at line 1887 of file Parser.php.

References $link, $prefix, $text, armorLinks(), and Linker::splitTrail().

Referenced by replaceInternalLinks2().

Parser::armorLinks ( text  ) 

Insert a NOPARSE hacky thing into any inline links in a chunk that's going to go through further parsing steps before inline URL expansion.

Not needed quite as much as it used to be since free links are a bit more sensible these days. But bracketed links are still an issue.

Parameters:
string more-or-less HTML
Returns:
string less-or-more HTML with NOPARSE bits

Definition at line 1904 of file Parser.php.

References $text, and wfUrlProtocols().

Referenced by makeKnownLinkHolder(), and replaceInternalLinks2().

Parser::areSubpagesAllowed ( &nbs