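Parse XML data

Extensible Markup Language (XML) is a set of rules for encoding documents in machine-readable form. XML is a popular format for sharing data on the internet.

Websites that frequently update their content, such as news sites or blogs, often provide an XML feed so that external programs can keep up with content changes. Downloading and parsing XML data is a common task for network-connected apps. This topic explains how to parse XML documents and use their data.

To learn more about creating web-based content in your Android app, see Web-based content.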

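Choose a parser

We recommend XmlPullParser, which is an efficient and maintainable way to parse XML on Android. Android has two implementations of this interface:

KXmlParser, using XmlPullParserFactory.newPullParser()
ExpatPullParser, using Xml.newPullParser()

Either choice is fine. The example in this section uses ExpatPullParser and Xml.newPullParser().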

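Analyze the feed

The first step in parsing a feed is to decide which fields you're interested in. The parser extracts data for those fields and ignores the rest.

Look at the following excerpt from a parsed feed in the sample app. Each post to StackOverflow.com appears in the feed as an entry tag that contains several nested tags: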

<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:creativeCommons="http://backend.userland.com/creativeCommonsRssModule" ...">
<title type="text">newest questions tagged android - Stack Overflow</title>
...
    <entry>
    ...
    </entry>
    <entry>
        <id>http://stackoverflow.com/q/9439999</id>
        <re:rank scheme="http://stackoverflow.com">0</re:rank>
        <title type="text">Where is my data file?</title>
        <category scheme="http://stackoverflow.com/feeds/tag?tagnames=android&amp;sort=newest/tags" term="android"/>
        <category scheme="http://stackoverflow.com/feeds/tag?tagnames=android&amp;sort=newest/tags" term="file"/>
        <author>
            <name>cliff2310</name>
            <uri>http://stackoverflow.com/users/1128925</uri>
        </author>
        <link rel="alternate" href="http://stackoverflow.com/questions/9439999/where-is-my-data-file" />
        <published>2012-02-25T00:30:54Z</published>
        <updated>2012-02-25T00:30:54Z</updated>
        <summary type="html">
            <p>I have an Application that requires a data file...</p>

        </summary>
    </entry>
    <entry>
    ...
    </entry>
...
</feed>

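The sample app extracts data for the entry tag and its nested tags title, link, and summary.

Instantiate the parser

The next step in parsing a feed is to instantiate a parser and kick off the parsing process. This snippet initializes a parser so that it doesn't process namespaces and uses the provided InputStream as its input. It starts the parsing process with a call to nextTag() and invokes the readFeed() method, which extracts and processes the data the app is interested in: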

Kotlin

// We don't use namespaces.
private val ns: String? = null

class StackOverflowXmlParser {

    @Throws(XmlPullParserException::class, IOException::class)
    fun parse(inputStream: InputStream): List<Entry> {
        inputStream.use { stream ->
            val parser: XmlPullParser = Xml.newPullParser()
            parser.setFeature(XmlPullParser.FEATURE_PROCESS_NAMESPACES, false)
            parser.setInput(stream, null)
            parser.nextTag()
            return readFeed(parser)
        }
    }
 ...
}

Java

public class StackOverflowXmlParser {
    // We don't use namespaces.
    private static final String ns = null;

    public List<Entry> parse(InputStream in) throws XmlPullParserException, IOException {
        try {
            XmlPullParser parser = Xml.newPullParser();
            parser.setFeature(XmlPullParser.FEATURE_PROCESS_NAMESPACES, false);
            parser.setInput(in, null);
            parser.nextTag();
            return readFeed(parser);
        } finally {
            in.close();
        }
    }
 ...
}

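Read the feed

The readFeed() method does the actual work of processing the feed. It looks for elements tagged "entry" as the starting point for recursively processing the feed. If a tag isn't an entry tag, the method skips it. Once the entire feed has been processed recursively, readFeed() returns a List containing the entries (including nested data members) that it extracted from the feed. The parser then returns this List.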

Kotlin

@Throws(XmlPullParserException::class, IOException::class)
private fun readFeed(parser: XmlPullParser): List<Entry> {
    val entries = mutableListOf<Entry>()

    parser.require(XmlPullParser.START_TAG, ns, "feed")
    while (parser.next() != XmlPullParser.END_TAG) {
        if (parser.eventType != XmlPullParser.START_TAG) {
            continue
        }
        // Starts by looking for the entry tag.
        if (parser.name == "entry") {
            entries.add(readEntry(parser))
        } else {
            skip(parser)
        }
    }
    return entries
}

Java

private List<Entry> readFeed(XmlPullParser parser) throws XmlPullParserException, IOException {
    List<Entry> entries = new ArrayList<>();

    parser.require(XmlPullParser.START_TAG, ns, "feed");
    while (parser.next() != XmlPullParser.END_TAG) {
        if (parser.getEventType() != XmlPullParser.START_TAG) {
            continue;
        }
        String name = parser.getName();
        // Starts by looking for the entry tag.
        if (name.equals("entry")) {
            entries.add(readEntry(parser));
        } else {
            skip(parser);
        }
    }
    return entries;
}

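Parse XML

The steps to parse an XML feed are as follows:

As described in Analyze the feed, identify the tags you want to include in your app. This example extracts data for the entry tag and its nested tags title, link, and summary.
Create the following methods:
- A "read" method for each tag you want to include, such as readEntry() and readTitle(). The parser reads tags from the input stream. When it encounters a tag named, in this example, entry, title, link, or summary, it calls the appropriate method for that tag. Otherwise, it skips the tag.
- Methods to extract data for each different type of tag and to advance the parser to the next tag. In this example, the relevant methods are as follows:
  - For the title and summary tags, the parser calls readText(). This method extracts data for these tags by calling parser.getText().
  - For the link tag, the parser first determines whether the link is the kind it's interested in. Then it uses parser.getAttributeValue() to extract the link's value.
  - For the entry tag, the parser calls readEntry(). This method parses the entry's nested tags and returns an Entry object with the data members title, link, and summary.
- A helper skip() method that steps over a tag together with everything nested inside it. For more discussion of this topic, see Skip tags you don't care about.

This snippet shows how the parser parses entries, titles, links, and summaries.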

Kotlin

data class Entry(val title: String?, val summary: String?, val link: String?)

// Parses the contents of an entry. If it encounters a title, summary, or link tag, hands them off
// to their respective "read" methods for processing. Otherwise, skips the tag.
@Throws(XmlPullParserException::class, IOException::class)
private fun readEntry(parser: XmlPullParser): Entry {
    parser.require(XmlPullParser.START_TAG, ns, "entry")
    var title: String? = null
    var summary: String? = null
    var link: String? = null
    while (parser.next() != XmlPullParser.END_TAG) {
        if (parser.eventType != XmlPullParser.START_TAG) {
            continue
        }
        when (parser.name) {
            "title" -> title = readTitle(parser)
            "summary" -> summary = readSummary(parser)
            "link" -> link = readLink(parser)
            else -> skip(parser)
        }
    }
    return Entry(title, summary, link)
}

// Processes title tags in the feed.
@Throws(IOException::class, XmlPullParserException::class)
private fun readTitle(parser: XmlPullParser): String {
    parser.require(XmlPullParser.START_TAG, ns, "title")
    val title = readText(parser)
    parser.require(XmlPullParser.END_TAG, ns, "title")
    return title
}

// Processes link tags in the feed.
@Throws(IOException::class, XmlPullParserException::class)
private fun readLink(parser: XmlPullParser): String {
    var link = ""
    parser.require(XmlPullParser.START_TAG, ns, "link")
    val relType = parser.getAttributeValue(null, "rel")
    if (relType == "alternate") {
        link = parser.getAttributeValue(null, "href") ?: ""
    }
    // Advances to the end tag even when rel isn't "alternate", so that the
    // require() call below doesn't throw for other link types.
    parser.nextTag()
    parser.require(XmlPullParser.END_TAG, ns, "link")
    return link
}

// Processes summary tags in the feed.
@Throws(IOException::class, XmlPullParserException::class)
private fun readSummary(parser: XmlPullParser): String {
    parser.require(XmlPullParser.START_TAG, ns, "summary")
    val summary = readText(parser)
    parser.require(XmlPullParser.END_TAG, ns, "summary")
    return summary
}

// For the tags title and summary, extracts their text values.
@Throws(IOException::class, XmlPullParserException::class)
private fun readText(parser: XmlPullParser): String {
    var result = ""
    if (parser.next() == XmlPullParser.TEXT) {
        result = parser.text
        parser.nextTag()
    }
    return result
}
...

Java

public static class Entry {
    public final String title;
    public final String link;
    public final String summary;

    private Entry(String title, String summary, String link) {
        this.title = title;
        this.summary = summary;
        this.link = link;
    }
}

// Parses the contents of an entry. If it encounters a title, summary, or link tag, hands them off
// to their respective "read" methods for processing. Otherwise, skips the tag.
private Entry readEntry(XmlPullParser parser) throws XmlPullParserException, IOException {
    parser.require(XmlPullParser.START_TAG, ns, "entry");
    String title = null;
    String summary = null;
    String link = null;
    while (parser.next() != XmlPullParser.END_TAG) {
        if (parser.getEventType() != XmlPullParser.START_TAG) {
            continue;
        }
        String name = parser.getName();
        if (name.equals("title")) {
            title = readTitle(parser);
        } else if (name.equals("summary")) {
            summary = readSummary(parser);
        } else if (name.equals("link")) {
            link = readLink(parser);
        } else {
            skip(parser);
        }
    }
    return new Entry(title, summary, link);
}

// Processes title tags in the feed.
private String readTitle(XmlPullParser parser) throws IOException, XmlPullParserException {
    parser.require(XmlPullParser.START_TAG, ns, "title");
    String title = readText(parser);
    parser.require(XmlPullParser.END_TAG, ns, "title");
    return title;
}

// Processes link tags in the feed.
private String readLink(XmlPullParser parser) throws IOException, XmlPullParserException {
    String link = "";
    parser.require(XmlPullParser.START_TAG, ns, "link");
    String relType = parser.getAttributeValue(null, "rel");
    // Constant-first equals() avoids a NullPointerException when rel is absent.
    if ("alternate".equals(relType)) {
        link = parser.getAttributeValue(null, "href");
    }
    // Advances to the end tag even when rel isn't "alternate", so that the
    // require() call below doesn't throw for other link types.
    parser.nextTag();
    parser.require(XmlPullParser.END_TAG, ns, "link");
    return link;
}

// Processes summary tags in the feed.
private String readSummary(XmlPullParser parser) throws IOException, XmlPullParserException {
    parser.require(XmlPullParser.START_TAG, ns, "summary");
    String summary = readText(parser);
    parser.require(XmlPullParser.END_TAG, ns, "summary");
    return summary;
}

// For the tags title and summary, extracts their text values.
private String readText(XmlPullParser parser) throws IOException, XmlPullParserException {
    String result = "";
    if (parser.next() == XmlPullParser.TEXT) {
        result = parser.getText();
        parser.nextTag();
    }
    return result;
}
  ...
}
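
The read methods above depend on Android's XmlPullParser, which isn't available on a desktop JVM. As a rough way to try the same flow off-device, the following sketch swaps in the JDK's javax.xml.stream.XMLStreamReader (StAX), a different pull API with a similar event model. The class name AtomFeedSketch and its parse(String) entry point are hypothetical, getElementText() stands in for readText(), and the code is a minimal illustration under those assumptions rather than the sample app's implementation.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical desktop stand-in: StAX replaces XmlPullParser here.
final class AtomFeedSketch {

    // Holds the three fields the walkthrough extracts from each entry.
    static final class Entry {
        final String title;
        final String summary;
        final String link;

        Entry(String title, String summary, String link) {
            this.title = title;
            this.summary = summary;
            this.link = link;
        }
    }

    // Parses a feed held in a string. The checked StAX exception is wrapped
    // so that quick experiments don't have to declare it.
    static List<Entry> parse(String xml) {
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            reader.nextTag();
            return readFeed(reader);
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Malformed feed", e);
        }
    }

    private static List<Entry> readFeed(XMLStreamReader reader) throws XMLStreamException {
        List<Entry> entries = new ArrayList<>();
        reader.require(XMLStreamConstants.START_ELEMENT, null, "feed");
        while (reader.next() != XMLStreamConstants.END_ELEMENT) {
            if (reader.getEventType() != XMLStreamConstants.START_ELEMENT) {
                continue;
            }
            if (reader.getLocalName().equals("entry")) {
                entries.add(readEntry(reader));
            } else {
                skip(reader);
            }
        }
        return entries;
    }

    private static Entry readEntry(XMLStreamReader reader) throws XMLStreamException {
        reader.require(XMLStreamConstants.START_ELEMENT, null, "entry");
        String title = null;
        String summary = null;
        String link = null;
        while (reader.next() != XMLStreamConstants.END_ELEMENT) {
            if (reader.getEventType() != XMLStreamConstants.START_ELEMENT) {
                continue;
            }
            String name = reader.getLocalName();
            if (name.equals("title")) {
                title = reader.getElementText();
            } else if (name.equals("summary")) {
                summary = reader.getElementText();
            } else if (name.equals("link")) {
                String href = readLink(reader);
                if (href != null) {
                    link = href;
                }
            } else {
                skip(reader);
            }
        }
        return new Entry(title, summary, link);
    }

    // Returns href for rel="alternate" links, or null for any other link.
    private static String readLink(XMLStreamReader reader) throws XMLStreamException {
        String rel = reader.getAttributeValue(null, "rel");
        String href = "alternate".equals(rel) ? reader.getAttributeValue(null, "href") : null;
        reader.nextTag(); // Moves to </link> whether or not rel matched.
        reader.require(XMLStreamConstants.END_ELEMENT, null, "link");
        return href;
    }

    // Consumes the current element and everything nested inside it.
    private static void skip(XMLStreamReader reader) throws XMLStreamException {
        int depth = 1;
        while (depth != 0) {
            int event = reader.next();
            if (event == XMLStreamConstants.END_ELEMENT) {
                depth--;
            } else if (event == XMLStreamConstants.START_ELEMENT) {
                depth++;
            }
        }
    }
}
```

Given a feed string with a single entry, AtomFeedSketch.parse(xml) returns a one-element list whose title, summary, and link fields hold that entry's values. Tags such as id and author are stepped over by skip().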

Skip tags you don't care about

The parser needs to skip tags it's not interested in. Here is the parser's skip() method:

Kotlin

@Throws(XmlPullParserException::class, IOException::class)
private fun skip(parser: XmlPullParser) {
    if (parser.eventType != XmlPullParser.START_TAG) {
        throw IllegalStateException()
    }
    var depth = 1
    while (depth != 0) {
        when (parser.next()) {
            XmlPullParser.END_TAG -> depth--
            XmlPullParser.START_TAG -> depth++
        }
    }
}

Java

private void skip(XmlPullParser parser) throws XmlPullParserException, IOException {
    if (parser.getEventType() != XmlPullParser.START_TAG) {
        throw new IllegalStateException();
    }
    int depth = 1;
    while (depth != 0) {
        switch (parser.next()) {
        case XmlPullParser.END_TAG:
            depth--;
            break;
        case XmlPullParser.START_TAG:
            depth++;
            break;
        }
    }
}

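Here's how it works:

It throws an exception if the current event isn't a START_TAG.
It consumes the START_TAG and all events up to and including the matching END_TAG.
It keeps track of the nesting depth to make sure it stops at the correct END_TAG and not at the first tag it encounters after the original START_TAG.

Thus, if the current element has nested elements, the value of depth isn't 0 until the parser has consumed all events between the original START_TAG and its matching END_TAG. For example, consider how the parser skips the <author> element, which has two nested elements, <name> and <uri>. The steps count only tag events. The next() method also returns the TEXT events between tags, and those leave depth unchanged.

The first time through the while loop, the next tag the parser encounters after <author> is the START_TAG for <name>. The value of depth is incremented to 2.
The second time through the while loop, the next tag the parser encounters is the END_TAG </name>. The value of depth is decremented to 1.
The third time through the while loop, the next tag the parser encounters is the START_TAG <uri>. The value of depth is incremented to 2.
The fourth time through the while loop, the next tag the parser encounters is the END_TAG </uri>. The value of depth is decremented to 1.
The fifth and final time through the while loop, the next tag the parser encounters is the END_TAG </author>. The value of depth is decremented to 0, which indicates that the <author> element was skipped successfully.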
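
To check the depth bookkeeping off-device, the walkthrough above can be replayed on a desktop JVM. The following sketch is a stand-in under stated assumptions and not part of the sample app: XmlPullParser isn't on the JDK classpath, so it uses javax.xml.stream.XMLStreamReader (StAX), whose START_ELEMENT and END_ELEMENT events play the role of START_TAG and END_TAG. The class name SkipDepthTrace and the helpers skipAndTrace() and traceAuthor() are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical desktop stand-in: StAX replaces XmlPullParser here.
final class SkipDepthTrace {

    // Same loop as skip(), but records depth after every tag event.
    // Text events fall through the switch and leave depth unchanged.
    static List<Integer> skipAndTrace(XMLStreamReader reader) throws XMLStreamException {
        if (reader.getEventType() != XMLStreamConstants.START_ELEMENT) {
            throw new IllegalStateException();
        }
        List<Integer> trace = new ArrayList<>();
        int depth = 1;
        while (depth != 0) {
            switch (reader.next()) {
                case XMLStreamConstants.END_ELEMENT:
                    depth--;
                    trace.add(depth);
                    break;
                case XMLStreamConstants.START_ELEMENT:
                    depth++;
                    trace.add(depth);
                    break;
                default:
                    break;
            }
        }
        return trace;
    }

    // Positions a reader on <author> from the feed excerpt and skips it.
    static List<Integer> traceAuthor() {
        String xml = "<entry><author><name>cliff2310</name>"
                + "<uri>http://stackoverflow.com/users/1128925</uri></author></entry>";
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            reader.nextTag(); // <entry>
            reader.nextTag(); // <author>
            return skipAndTrace(reader);
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Calling SkipDepthTrace.traceAuthor() returns [2, 1, 2, 1, 0], which matches the five steps in the walkthrough. The text inside <name> and <uri> produces events too, but they never reach the depth counter.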

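Consume XML data

The sample app fetches and parses the XML feed asynchronously, which keeps the processing off the main UI thread. When processing is complete, the app updates the UI in its main activity, NetworkActivity.

In the following excerpt, the loadPage() method does the following:

Initializes a string variable with the URL for the XML feed.
If the user's settings and the network connection allow it, invokes the downloadXml(url) method. This method downloads and parses the feed and returns a string result to display in the UI.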

Kotlin

class NetworkActivity : Activity() {

    companion object {

        const val WIFI = "Wi-Fi"
        const val ANY = "Any"
        const val SO_URL = "http://stackoverflow.com/feeds/tag?tagnames=android&sort=newest"
        // Whether there is a Wi-Fi connection.
        private var wifiConnected = false
        // Whether there is a mobile connection.
        private var mobileConnected = false

        // Whether the display should be refreshed.
        var refreshDisplay = true
        // The user's current network preference setting.
        var sPref: String? = null
    }
    ...
    // Asynchronously downloads the XML feed from stackoverflow.com.
    fun loadPage() {

        if (sPref.equals(ANY) && (wifiConnected || mobileConnected)) {
            downloadXml(SO_URL)
        } else if (sPref.equals(WIFI) && wifiConnected) {
            downloadXml(SO_URL)
        } else {
            // Show error.
        }
    }
    ...
}

Java

public class NetworkActivity extends Activity {
    public static final String WIFI = "Wi-Fi";
    public static final String ANY = "Any";
    private static final String URL = "http://stackoverflow.com/feeds/tag?tagnames=android&sort=newest";

    // Whether there is a Wi-Fi connection.
    private static boolean wifiConnected = false;
    // Whether there is a mobile connection.
    private static boolean mobileConnected = false;
    // Whether the display should be refreshed.
    public static boolean refreshDisplay = true;
    public static String sPref = null;
    ...
    // Asynchronously downloads the XML feed from stackoverflow.com.
    public void loadPage() {
        // Constant-first equals() avoids a NullPointerException while sPref is still null.
        if (ANY.equals(sPref) && (wifiConnected || mobileConnected)) {
            downloadXml(URL);
        } else if (WIFI.equals(sPref) && wifiConnected) {
            downloadXml(URL);
        } else {
            // Show error.
        }
    }
    ...
}

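In Kotlin, the downloadXml method makes the following calls:

lifecycleScope.launch(Dispatchers.IO) uses Kotlin coroutines to launch the loadXmlFromNetwork() method on the IO thread, passing the feed URL as a parameter. The loadXmlFromNetwork() method fetches and processes the feed. When it finishes, it passes back a result string.
withContext(Dispatchers.Main) uses Kotlin coroutines to return to the main thread, takes the returned string, and displays it in the UI.

In the Java programming language, the process is as follows:

An Executor runs the loadXmlFromNetwork() method on a background thread, passing the feed URL as a parameter. The loadXmlFromNetwork() method fetches and processes the feed. When it finishes, it passes back a result string.
A Handler calls post to return to the main thread, takes the returned string, and displays it in the UI.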

Kotlin

// Implementation of Kotlin coroutines used to download XML feed from stackoverflow.com.
private fun downloadXml(vararg urls: String) {
    var result: String? = null
    lifecycleScope.launch(Dispatchers.IO) {
        result = try {
            loadXmlFromNetwork(urls[0])
        } catch (e: IOException) {
            resources.getString(R.string.connection_error)
        } catch (e: XmlPullParserException) {
            resources.getString(R.string.xml_error)
        }
        withContext(Dispatchers.Main) {
            setContentView(R.layout.main)
            // Displays the HTML string in the UI via a WebView.
            findViewById<WebView>(R.id.webview)?.apply {
                loadData(result?: "", "text/html", null)
            }
        }
    }
}

Java

// Implementation of Executor and Handler used to download XML feed asynchronously from stackoverflow.com.
private void downloadXml(String... urls) {
    ExecutorService executor = Executors.newSingleThreadExecutor();
    Handler handler = new Handler(Looper.getMainLooper());
    executor.execute(() -> {
        String result;
        try {
            result = loadXmlFromNetwork(urls[0]);
        } catch (IOException e) {
            result = getResources().getString(R.string.connection_error);
        } catch (XmlPullParserException e) {
            result = getResources().getString(R.string.xml_error);
        }
        String finalResult = result;
        handler.post(() -> {
            setContentView(R.layout.main);
            // Displays the HTML string in the UI via a WebView.
            WebView myWebView = (WebView) findViewById(R.id.webview);
            myWebView.loadData(finalResult, "text/html", null);
        });
    });
}
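
The hand-off between the background thread and the main thread can also be tried without Android classes. The following sketch assumes a plain JVM, where there is no Handler or main Looper, so a second single-thread executor stands in for the main thread. The class name BackgroundLoadSketch, the method loadThenDeliver(), and its errorText parameter are hypothetical. It only illustrates the shape of the pattern: load in the background, fall back to an error string on failure, then deliver the result on another thread.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical desktop stand-in for the Executor + Handler hand-off.
final class BackgroundLoadSketch {

    // Runs `load` on a background thread, then hands the result (or the
    // fallback text if loading throws) to a second single-thread executor
    // that stands in for Android's main thread.
    static String loadThenDeliver(Callable<String> load, String errorText) {
        ExecutorService background = Executors.newSingleThreadExecutor();
        ExecutorService mainStandIn = Executors.newSingleThreadExecutor();
        CompletableFuture<String> delivered = new CompletableFuture<>();
        try {
            background.execute(() -> {
                String result;
                try {
                    result = load.call();
                } catch (Exception e) {
                    result = errorText;
                }
                String finalResult = result;
                // On Android this hop would be handler.post(...).
                mainStandIn.execute(() -> delivered.complete(finalResult));
            });
            // Blocks the caller only so the sketch can return the value.
            return delivered.get(5, TimeUnit.SECONDS);
        } catch (InterruptedException | ExecutionException | TimeoutException e) {
            throw new IllegalStateException(e);
        } finally {
            background.shutdown();
            mainStandIn.shutdown();
        }
    }
}
```

Unlike this sketch, the app code never blocks waiting for the result. It updates the WebView from inside the posted callback.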

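The next snippet shows the loadXmlFromNetwork() method that is invoked from downloadXml. It does the following:

Instantiates a StackOverflowXmlParser. It also creates variables for a List of Entry objects (entries) and for title, url, and summary, to hold the values extracted from the XML feed for those fields.
Calls downloadUrl(), which fetches the feed and returns it as an InputStream.
Uses StackOverflowXmlParser to parse the InputStream. StackOverflowXmlParser populates the List of entries with data from the feed.
Processes the entries List and combines the feed data with HTML markup.
Returns an HTML string that is displayed in the main activity UI.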

Kotlin

// Downloads XML from stackoverflow.com, parses it, and combines it with
// HTML markup. Returns HTML string.
@Throws(XmlPullParserException::class, IOException::class)
private fun loadXmlFromNetwork(urlString: String): String {
    // Checks whether the user set the preference to include summary text.
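    val pref: Boolean = PreferenceManager.getDefaultSharedPreferences(this)?.run {
        getBoolean("summaryPref", false)
    } ?: false

    // Timestamp and format for the "updated" line, as in the Java version.
    val rightNow = Calendar.getInstance()
    val formatter = SimpleDateFormat("MMM dd h:mmaa")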

    val entries: List<Entry> = downloadUrl(urlString)?.use { stream ->
        // Instantiates the parser.
        StackOverflowXmlParser().parse(stream)
    } ?: emptyList()

    return StringBuilder().apply {
        append("<h3>${resources.getString(R.string.page_title)}</h3>")
        append("<em>${resources.getString(R.string.updated)} ")
        append("${formatter.format(rightNow.time)}</em>")
        // StackOverflowXmlParser returns a List (called "entries") of Entry objects.
        // Each Entry object represents a single post in the XML feed.
        // This section processes the entries list to combine each entry with HTML markup.
        // Each entry is displayed in the UI as a link that optionally includes
        // a text summary.
        entries.forEach { entry ->
            append("<p><a href='")
            append(entry.link)
            append("'>" + entry.title + "</a></p>")
            // If the user set the preference to include summary text,
            // adds it to the display.
            if (pref) {
                append(entry.summary)
            }
        }
    }.toString()
}

// Given a string representation of a URL, sets up a connection and gets
// an input stream.
@Throws(IOException::class)
private fun downloadUrl(urlString: String): InputStream? {
    val url = URL(urlString)
    return (url.openConnection() as? HttpURLConnection)?.run {
        readTimeout = 10000
        connectTimeout = 15000
        requestMethod = "GET"
        doInput = true
        // Starts the query.
        connect()
        inputStream
    }
}

Java

// Downloads XML from stackoverflow.com, parses it, and combines it with
// HTML markup. Returns HTML string.
private String loadXmlFromNetwork(String urlString) throws XmlPullParserException, IOException {
    InputStream stream = null;
    // Instantiates the parser.
    StackOverflowXmlParser stackOverflowXmlParser = new StackOverflowXmlParser();
    List<Entry> entries = null;
    String title = null;
    String url = null;
    String summary = null;
    Calendar rightNow = Calendar.getInstance();
    DateFormat formatter = new SimpleDateFormat("MMM dd h:mmaa");

    // Checks whether the user set the preference to include summary text.
    SharedPreferences sharedPrefs = PreferenceManager.getDefaultSharedPreferences(this);
    boolean pref = sharedPrefs.getBoolean("summaryPref", false);

    StringBuilder htmlString = new StringBuilder();
    htmlString.append("<h3>" + getResources().getString(R.string.page_title) + "</h3>");
    htmlString.append("<em>" + getResources().getString(R.string.updated) + " " +
            formatter.format(rightNow.getTime()) + "</em>");

    try {
        stream = downloadUrl(urlString);
        entries = stackOverflowXmlParser.parse(stream);
    // Makes sure that the InputStream is closed after the app is
    // finished using it.
    } finally {
        if (stream != null) {
            stream.close();
        }
    }

    // StackOverflowXmlParser returns a List (called "entries") of Entry objects.
    // Each Entry object represents a single post in the XML feed.
    // This section processes the entries list to combine each entry with HTML markup.
    // Each entry is displayed in the UI as a link that optionally includes
    // a text summary.
    for (Entry entry : entries) {
        htmlString.append("<p><a href='");
        htmlString.append(entry.link);
        htmlString.append("'>" + entry.title + "</a></p>");
        // If the user set the preference to include summary text,
        // adds it to the display.
        if (pref) {
            htmlString.append(entry.summary);
        }
    }
    return htmlString.toString();
}

// Given a string representation of a URL, sets up a connection and gets
// an input stream.
private InputStream downloadUrl(String urlString) throws IOException {
    URL url = new URL(urlString);
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setReadTimeout(10000 /* milliseconds */);
    conn.setConnectTimeout(15000 /* milliseconds */);
    conn.setRequestMethod("GET");
    conn.setDoInput(true);
    // Starts the query.
    conn.connect();
    return conn.getInputStream();
}
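
The step that combines entries with HTML markup can be exercised on its own, without a network connection or Android resources. The following sketch assumes a hypothetical FeedHtmlSketch class with its own minimal Entry type and a toHtml() helper. It leaves out the page title and timestamp, and it inserts values verbatim, as the loadXmlFromNetwork() code above does.

```java
import java.util.List;

// Hypothetical stand-in for the HTML assembly step; not the sample app's class.
final class FeedHtmlSketch {

    // Minimal entry type with the three fields used for display.
    static final class Entry {
        final String title;
        final String summary;
        final String link;

        Entry(String title, String summary, String link) {
            this.title = title;
            this.summary = summary;
            this.link = link;
        }
    }

    // Turns each entry into a linked title, followed by its summary when
    // includeSummary is true. No escaping is applied to the values.
    static String toHtml(List<Entry> entries, boolean includeSummary) {
        StringBuilder html = new StringBuilder();
        for (Entry entry : entries) {
            html.append("<p><a href='")
                    .append(entry.link)
                    .append("'>")
                    .append(entry.title)
                    .append("</a></p>");
            if (includeSummary) {
                html.append(entry.summary);
            }
        }
        return html.toString();
    }
}
```

Passing false for includeSummary corresponds to the summaryPref preference being off, in which case each entry is rendered as a link only.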
