Parsing XML with PHP
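Extracting data from XML is a common task, but to work with that data directly you need to understand how PHP parses XML. Parsing XML in PHP involves several functions that work together to pull data out of an XML document. I will go through each of them and tie them together at the end.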
xml_parser_create()
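This function creates the parser object that is used throughout the rest of the process. The object stores data and configuration options and is passed to every function involved.
$xml_parser = xml_parser_create();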
xml_set_element_handler()
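Next we need to set the functions that the parser will call as it works through the document. The xml_set_element_handler() function takes the following parameters:
- XML parser: the parser created with xml_parser_create().
- Start element: a callback reference to the function that is called when the running parser finds a start element.
- End element: a callback reference to the function that is called when the running parser finds an end element.
The last two parameters must be functions with a specific signature. That means they must accept the right number of parameters, but you can name them whatever you like. Here is an example call to xml_set_element_handler().
xml_set_element_handler($xml_parser, "startElement", "endElement");
The startElement() and endElement() functions are then called automatically by the XML parser once parsing is under way.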
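As a minimal sketch of the registration step, the following parses a short inline document and records each event in an array. The sample XML, the $events array and the use of anonymous functions instead of named handlers are assumptions made for illustration only.

```php
<?php
// Record parser events in an array instead of echoing them.
$events = [];

$parser = xml_parser_create();

xml_set_element_handler(
    $parser,
    // Start handler: receives the parser, the element name and its attributes.
    function ($parser, $name, $attribs) use (&$events) {
        $events[] = "start:" . $name;
    },
    // End handler: receives the parser and the element name.
    function ($parser, $name) use (&$events) {
        $events[] = "end:" . $name;
    }
);

// The third argument marks this as the final (and only) chunk of input.
xml_parse($parser, "<note><to>Ann</to></note>", true);

print_r($events);
```

The names arrive in uppercase (NOTE, TO) because case folding is enabled by default.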
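startElement() function
For the call to xml_set_element_handler() to work, you need to define a function that reads the start element data. The function must accept the following parameters:
- Parser: the XML parser object created in the call to xml_parser_create().
- Name: the name of the start element.
- Attribs: an associative array of the attributes the start element contains.
So your function might look like this:
function startElement($xmlParser, $name, $attribs) {
    echo "Start: " . $name . "\n";
}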
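All this does is print the name of the element, but you can do much more. For example, say one of your elements is called title and you want to store its name in a variable:
function startElement($xmlParser, $name, $attribs) {
    global $variable;
    switch ($name) {
        // Matches only when case folding is off; with folding on the name is 'TITLE'.
        case 'title':
            $variable = $name;
            break;
    }
}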
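Remember that this function must be defined by the time the parser runs, because the parser looks it up by name. A simple way to guarantee that is to put the declaration BEFORE the call to xml_set_element_handler(), although PHP also makes unconditional top-level functions available throughout the file in which they are declared.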
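The $attribs array is where most of the useful start-element data lives. Here is a small sketch that collects the href attribute of every link element; the element name, the attribute name and the sample document are hypothetical.

```php
<?php
$links = [];

$parser = xml_parser_create();
// Turn case folding off so names and attribute keys keep their original case.
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

xml_set_element_handler(
    $parser,
    function ($parser, $name, $attribs) use (&$links) {
        // $attribs is an associative array: attribute name => value.
        if ($name === 'link' && isset($attribs['href'])) {
            $links[] = $attribs['href'];
        }
    },
    function ($parser, $name) {
        // Nothing to do on end elements in this sketch.
    }
);

xml_parse($parser, '<page><link href="/a"/><link href="/b">B</link><link/></page>', true);

print_r($links);
```

The third link element has no href attribute, so the isset() check skips it.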
endElement() function
This function is called when the parser encounters an XML closing element. As the opposite of the start handler, you might use it to clear a variable you stored in the startElement() function. Again, this function must be defined by the time the parser runs. Note that a self-closing tag still triggers both handlers: the start handler is called, followed immediately by the end handler. The function must have the following parameters.
- xml_parser: The parser created in the call to xml_parser_create().
- name: The name of the element.
The following code just prints the name of the end element, but you can use this function to undo anything that happened in the startElement() function. For example, you may have set a value in startElement() to keep track of how deep the parser is in the XML document; you can use this function to reduce it. This matters when more than one element has the same name but appears in a different context.
function endElement($parser, $name)
{
    echo "End: " . $name . "\n";
}
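The depth tracking described above could be sketched like this: the start handler increments a counter, the end handler decrements it, and the counter is used to indent each element name. The $depth and $lines variables and the sample feed are assumptions for illustration.

```php
<?php
$depth = 0;   // How many elements are currently open.
$lines = [];  // Indented element names, one per start element.

$parser = xml_parser_create();
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

xml_set_element_handler(
    $parser,
    function ($parser, $name, $attribs) use (&$depth, &$lines) {
        // Indent by the current depth, then go one level deeper.
        $lines[] = str_repeat('  ', $depth) . $name;
        $depth++;
    },
    function ($parser, $name) use (&$depth) {
        // Leaving an element: come back up one level.
        $depth--;
    }
);

xml_parse($parser, '<feed><title>Site</title><item><title>Post</title></item></feed>', true);

echo implode("\n", $lines), "\n";
```

The two title elements end up at different indentation levels, which is one way to tell apart elements that share a name but sit in different contexts.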
xml_set_character_data_handler()
The next function to call is xml_set_character_data_handler(). This takes two parameters:
- xml_parser: The XML parser that was created in the call to xml_parser_create().
- characterData: A callback reference to the function that will be called when character data is found.
This function works in the same way as xml_set_element_handler() in that it simply registers the function that will be called when character data is encountered. It is called like this.
xml_set_character_data_handler($xml_parser, "characterData");
characterData() function
The characterData() function, which again must be defined by the time the parser runs, must have the following parameters.
- xml_parser: The XML parser created in the call to xml_parser_create().
- data: The data held within the XML element. If any CDATA sections have been used, the parser returns everything between the CDATA markers, so there is no need to worry about cutting them out.
So when the parser finds character data, this function is called. The following function just prints out the data.
function characterData($parser, $data) {
    echo "Data: " . $data . "\n";
}
One thing you must look out for is what the parser does when it encounters certain conditions. It ends the current piece of character data, calls the handler with it, and then calls the handler again with the next piece. This repeats until all of the data has been passed. I've listed (I think) all of the conditions below.
- The parser runs into an entity reference, such as &amp; (&) or &apos; (').
- The parser finishes parsing an entity.
- The parser runs into the new-line character (\n).
- The parser runs into a series of tab characters (\t).
- The content of the $data parameter is more than 1024 bytes.
The best way to explain this is with an example. Let's say that you have the following string as part of the data.
some text &amp;
some more text &apos;
last bit of text
If you used the previous example function, which just prints out the information, the parser will print the following:
Data: some text
Data: &
Data: some more text
Data: '
Data: last bit of text
So when you write the handler, make sure that all of the character data gets through, not just the first piece. One thing you could do is have the characterData() function append the data to a string. The string is initialised when the startElement() function is called and printed when the endElement() function is called.
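A minimal sketch of that buffering approach follows. It assumes the elements of interest contain only text (no child elements), and the $buffer and $values variables and the sample document are hypothetical.

```php
<?php
$buffer = '';  // Text collected for the element currently being read.
$values = [];  // Complete text of each <text> element.

$parser = xml_parser_create();
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

xml_set_element_handler(
    $parser,
    function ($parser, $name, $attribs) use (&$buffer) {
        // A new element starts: begin with an empty buffer.
        $buffer = '';
    },
    function ($parser, $name) use (&$buffer, &$values) {
        // The element is complete: the buffer now holds all of its text.
        if ($name === 'text') {
            $values[] = $buffer;
        }
    }
);

xml_set_character_data_handler(
    $parser,
    function ($parser, $data) use (&$buffer) {
        // May be called several times per element, so append rather than assign.
        $buffer .= $data;
    }
);

$xml = "<doc><text>some text &amp;\nsome more text &apos;\nlast bit of text</text></doc>";
xml_parse($parser, $xml, true);

echo $values[0], "\n";
```

However many times the character data handler fires, the pieces are joined back into one string before the end handler reads it.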
xml_parser_set_option()
This function is optional and can be used if you want the parser to behave in a certain way. For example, to turn off case folding on the parser use the following code.
xml_parser_set_option($xml_parser, XML_OPTION_CASE_FOLDING, false);
Case folding is basically the conversion of characters to their uppercase equivalent. However, XML is case-sensitive, so folding element names to uppercase throws information away, and for some reason the parser has this turned on by default. So if you need element names exactly as they appear in the document, make sure that you use this function to turn off case folding. Here is a list of the available options for this function.
- XML_OPTION_CASE_FOLDING: (integer) Controls whether case folding is enabled for this XML parser. Enabled by default.
- XML_OPTION_SKIP_TAGSTART: (integer) Specifies how many characters should be skipped at the beginning of a tag name.
- XML_OPTION_SKIP_WHITE: (integer) Whether to skip values consisting of whitespace characters.
- XML_OPTION_TARGET_ENCODING: (string) Sets which target encoding to use in this XML parser. By default, it is set to the same as the source encoding used by xml_parser_create(). Supported target encodings are ISO-8859-1, US-ASCII and UTF-8.
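To see what case folding actually changes, here is a small sketch that parses the same document with the option on and off. The helper name elementNames() and the sample document are assumptions for illustration.

```php
<?php
// Hypothetical helper: returns the element names reported by the start handler.
function elementNames(string $xml, bool $fold): array
{
    $names = [];
    $parser = xml_parser_create();
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, $fold ? 1 : 0);
    xml_set_element_handler(
        $parser,
        function ($parser, $name, $attribs) use (&$names) {
            $names[] = $name;
        },
        function ($parser, $name) {
            // End elements are not needed here.
        }
    );
    xml_parse($parser, $xml, true);
    return $names;
}

$xml = '<Feed><item/></Feed>';
print_r(elementNames($xml, true));   // Names folded to uppercase: FEED, ITEM
print_r(elementNames($xml, false));  // Names as written: Feed, item
```

Any comparison in a handler, such as a switch on the element name, has to match whichever form the parser is configured to deliver.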
xml_parse()
This function is used to run the parser over some input. It takes the following parameters:
- xml_parser: The XML parser object created by the xml_parser_create() function.
- data: A chunk of data to parse. This can be read from a file or a stream.
- end: (optional) If this is set to true then this is the last bit of data from the source, and so this is the last time the function will be run.
As you can see, the xml_parse() function can be run over and over again until all of the data has been read from the file.
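// Open the XML file for reading.
if (!($fp = fopen("an_xml_file.xml", "r"))) {
    die("could not open XML input");
}
// Feed the file to the parser in 4096-byte chunks; feof() flags the last one.
while ($data = fread($fp, 4096)) {
    if (!xml_parse($xml_parser, $data, feof($fp))) {
        die(sprintf("XML error: %s at line %d", xml_error_string(xml_get_error_code($xml_parser)), xml_get_current_line_number($xml_parser)));
    }
}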
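The same chunked pattern can be tried without a file. This sketch splits an in-memory string into deliberately small chunks, so tags are cut across chunk boundaries, and sets the final flag only on the last one. The sample document, the 8-byte chunk size and the $count variable are assumptions for illustration.

```php
<?php
$xml = '<list><item>one</item><item>two</item></list>';
$count = 0;

$parser = xml_parser_create();
xml_set_element_handler(
    $parser,
    function ($parser, $name, $attribs) use (&$count) {
        // Case folding is on by default, so the name arrives as 'ITEM'.
        if ($name === 'ITEM') {
            $count++;
        }
    },
    function ($parser, $name) {
        // End elements are not needed here.
    }
);

// Tiny chunks on purpose: the parser keeps state between calls.
$chunks = str_split($xml, 8);
$last = count($chunks) - 1;

foreach ($chunks as $i => $chunk) {
    // The third argument is true only for the final chunk.
    if (!xml_parse($parser, $chunk, $i === $last)) {
        throw new RuntimeException(sprintf(
            'XML error: %s at line %d',
            xml_error_string(xml_get_error_code($parser)),
            xml_get_current_line_number($parser)
        ));
    }
}

echo $count, "\n";
```

Because the parser carries its state from one call to the next, chunk boundaries do not need to line up with element boundaries.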
xml_parser_free()
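As the name suggests, this function is called at the end of the XML parsing run. It basically just clears up the memory and throws away the XML parser object created at the start. Since PHP 8.0 the parser is an object that is released automatically, so the call has no effect there.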
Putting them all together
Just as an example, I have put the code together into something that will spit out the contents of an XML file as formatted output, albeit a little ugly. It is designed to be expanded upon so that you can create your own XML parsing script.
// Start element handler
function startElement($xmlParser, $name, $attribs) {
    echo "Start: " . $name . "\n";
}
// End element handler
function endElement($parser, $name) {
    echo "End: " . $name . "\n";
}
// Character data handler
function characterData($parser, $data) {
    echo "Data: " . $data . "\n";
}
$xml_parser = xml_parser_create();
xml_parser_set_option($xml_parser, XML_OPTION_CASE_FOLDING, false);
xml_set_element_handler($xml_parser, "startElement", "endElement");
xml_set_character_data_handler($xml_parser, "characterData");
if (!($fp = fopen("an_xml_file.xml", "r"))) {
    die("could not open XML input");
}
while ($data = fread($fp, 4096)) {
    if (!xml_parse($xml_parser, $data, feof($fp))) {
        die(sprintf("XML error: %s at line %d", xml_error_string(xml_get_error_code($xml_parser)), xml_get_current_line_number($xml_parser)));
    }
}
fclose($fp);