Hive

HiveReader

Bases: BaseReader

Reads documents from Hive.

These documents can then be used in downstream LlamaIndex data structures.

Parameters:

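| Name | Type | Description | Default |
|---|---|---|---|
| host | str | The host HiveServer2 runs on. | required |
| port | Optional[int] | The port HiveServer2 runs on. Defaults to 10000. | None |
| auth | Optional[str] | The hive.server2.authentication value used by HiveServer2. Defaults to NONE. | None |
| database | Optional[str] | The database name. | None |
| password | Optional[str] | Use only with auth='LDAP' or auth='CUSTOM'. | None |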
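
The constructor arguments listed above could be assembled as in the following minimal sketch. The helper `hive_reader_kwargs` and the wrapper `make_reader` are hypothetical and not part of LlamaIndex; the helper only enforces the documented rule that `password` belongs with `auth='LDAP'` or `auth='CUSTOM'`. The import path `llama_index.readers.hive` is an assumption inferred from the source file location, and the host name is a placeholder.

```python
from typing import Any, Dict, Optional


def hive_reader_kwargs(
    host: str,
    port: Optional[int] = None,
    database: Optional[str] = None,
    username: Optional[str] = None,
    password: Optional[str] = None,
    auth: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: collect HiveReader arguments and check the password rule."""
    if password is not None and auth not in ("LDAP", "CUSTOM"):
        raise ValueError("password is only used with auth='LDAP' or auth='CUSTOM'")
    return {
        "host": host,
        "port": port,
        "database": database,
        "username": username,
        "password": password,
        "auth": auth,
    }


def make_reader(**kwargs: Any):
    """Build a HiveReader; needs `pyhive` installed and a reachable HiveServer2."""
    # Assumed import path, inferred from llama_index/readers/hive/base.py.
    from llama_index.readers.hive import HiveReader

    return HiveReader(**hive_reader_kwargs(**kwargs))


# Placeholder values; nothing connects until make_reader(...) is called.
kwargs = hive_reader_kwargs(host="hive.example.com", port=10000, database="default")
print(kwargs["host"], kwargs["port"])
```

Validating the arguments before opening the connection keeps configuration mistakes separate from network or query failures.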
Source code in llama-index-integrations/readers/llama-index-readers-hive/llama_index/readers/hive/base.py
from typing import List, Optional

from llama_index.core.readers.base import BaseReader
from llama_index.core.schema import Document


class HiveReader(BaseReader):
    """
    Read documents from Hive.

    These documents can then be used in a downstream LlamaIndex data structure.

    Args:
        host: The host HiveServer2 runs on.
        port: The port HiveServer2 runs on. Defaults to 10000.
        database: The database name.
        username: The username passed to the Hive connection.
        password: Use with auth='LDAP' or auth='CUSTOM' only.
        auth: The value of hive.server2.authentication used by HiveServer2.
            Defaults to ``NONE``.

    """

    def __init__(
        self,
        host: str,
        port: Optional[int] = None,
        database: Optional[str] = None,
        username: Optional[str] = None,
        password: Optional[str] = None,
        auth: Optional[str] = None,
    ):
        """Initialize with parameters."""
        try:
            from pyhive import hive
        except ImportError:
            raise ImportError(
                "`pyhive` package not found, please run `pip install pyhive`"
            )

        # Open the connection once; load_data reuses it for every query.
        self.con = hive.Connection(
            host=host,
            port=port,
            username=username,
            database=database,
            auth=auth,
            password=password,
        )

    def load_data(self, query: str) -> List[Document]:
        """
        Read data from Hive.

        Args:
            query (str): The query used to retrieve data from Hive.
        Returns:
            List[Document]: A list of documents, one per result row.

        """
        try:
            # Create the cursor first, then execute the query on it once.
            cursor = self.con.cursor()
            cursor.execute(query)
            rows = cursor.fetchall()
        except Exception as exc:
            raise Exception(
                "Exception raised during execution; please check your connection parameters and query."
            ) from exc

        documents = []
        for row in rows:
            # Each row is a tuple of column values; Document text must be a string.
            documents.append(Document(text=str(row)))
        return documents

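load_data

load_data(query: str) -> List[Document]

Reads data from Hive.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| query | str | The query used to retrieve data from Hive. | required |

Returns: List[Document]: A list of documents.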
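
The query-to-documents flow described above follows the usual Python DB-API pattern: open a cursor, execute the query, fetch all rows, and wrap each row in a document. The sketch below illustrates that pattern with the standard-library `sqlite3` module standing in for a Hive connection and a small `Doc` dataclass standing in for `Document`. Both stand-ins, and the function name `rows_to_docs`, are assumptions for illustration and are not the library's code.

```python
import sqlite3
from dataclasses import dataclass
from typing import Any, List


@dataclass
class Doc:
    """Stand-in for llama_index's Document (illustration only)."""

    text: str


def rows_to_docs(con: Any, query: str) -> List[Doc]:
    """Run `query` on a DB-API connection and return one Doc per row."""
    cursor = con.cursor()
    cursor.execute(query)
    rows = cursor.fetchall()
    # Each row is a tuple of column values; store its string form as the text.
    return [Doc(text=str(row)) for row in rows]


# In-memory SQLite database used in place of HiveServer2.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE users (id INTEGER, name TEXT)")
con.executemany("INSERT INTO users VALUES (?, ?)", [(1, "ada"), (2, "grace")])

docs = rows_to_docs(con, "SELECT id, name FROM users ORDER BY id")
for doc in docs:
    print(doc.text)
```

With a real `HiveReader`, the equivalent call would be `reader.load_data(query)`, which needs a reachable HiveServer2 instance.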

Source code in llama-index-integrations/readers/llama-index-readers-hive/llama_index/readers/hive/base.py
def load_data(self, query: str) -> List[Document]:
    """
    Read data from Hive.

    Args:
        query (str): The query used to retrieve data from Hive.
    Returns:
        List[Document]: A list of documents, one per result row.

    """
    try:
        # Create the cursor first, then execute the query on it once.
        cursor = self.con.cursor()
        cursor.execute(query)
        rows = cursor.fetchall()
    except Exception as exc:
        raise Exception(
            "Exception raised during execution; please check your connection parameters and query."
        ) from exc

    documents = []
    for row in rows:
        # Each row is a tuple of column values; Document text must be a string.
        documents.append(Document(text=str(row)))
    return documents