# 08.5.2 逐词读取文本文件

本节中展示的技术将通过`byWord.go`文件演示，它由四部分组成。正如你在`Go`代码中看到的，分隔一行中的单词可能比较棘手。程序的第一部分如下：

```go
package main

import (
    "bufio"
    "flag"
    "fmt"
    "io"
    "os"
    "regexp"
)
```

`byWord.go`的第二部分代码如下：

```go
func wordByWord(file string) error {
    var err error
    f, err := os.Open(file)
    if err != nil {
        return err
    }
    defer f.Close()

    r := bufio.NewReader(f)
    for {
        line, err := r.ReadString('\n')
        if err == io.EOF {
            break
        } else if err != nil {
            fmt.Printf("error reading file %s", err)
            return err
        }
```

`wordByWord()`函数的这部分代码和`byLine.go`程序的`lineByLine()`函数一样。

`byWord.go`第三部分代码如下：

```go
        r := regexp.MustCompile("[^\\s]+")
          words := r.FindAllString(line, -1)
          for i := 0; i < len(words); i++ {
              fmt.Printf(words[i])
          }
    }
    return nil
}
```

`wordByWord()`函数的剩余代码是全新的，并使用正则表达式对输入的每行进行单词分割。正则表达式`regexp.MustCompile("[^\\s]+")`使用空格分割单词。

`byWord.go`的最后一部分代码如下：

```go
func main() {
    flag.Parse()
    if len(flag.Args()) == 0 {
        fmt.Printf("usage: byWord <file1> [<file2> ...]\n")
        return
    }

    for _, file := range flag.Args() {
        err := wordByWord(file)
        if err != nil {
            fmt.Println(err)
        }
    }
}
```

执行`byWord.go`会产生如下的输出：

```
$ go run byWord.go /tmp/adobegc.log
01/08/18
20:25:09:669
|
[INFO]
```

可以使用`wc(1)`验证`byWord.go`的正确性：

```
$ go run byWord.go /tmp/adobegc.log | wc
    91591     91591     559005
$ wc /tmp/adobegc.log
    4831      91591     583454     /tmp/adobegc.log
```

如你所见，`wc(1)`计算所得的单词数和`byWord.go`一致。


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://wskdsgcf.gitbook.io/mastering-go-zh-cn/08.0/08.5/08.5.2.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
