• 跳至主要導覽
  • 跳至主要內容
  • 跳至主要資訊欄
Office 指南

Office 指南

辦公室工作實用教學

  • Excel
  • Word
  • PowerPoint
  • Windows
  • PowerShell
  • R

Python 讀取 SAS7BDAT 格式 SAS 檔案教學與範例

介紹如何在 Python 中使用 sas7bdat 模組讀取 SAS7BDAT 格式的 SAS 檔案。

相關文章:
R 讀取 SAS7BDAT 格式 SAS 檔案教學與範例

安裝 sas7bdat 模組

Python 的 sas7bdat 模組可以透過 pip 指令來安裝:

# 安裝 sas7bdat 模組
pip install sas7bdat

讀取 SAS7BDAT 格式 SAS 檔案

這裡我們以 airline.sas7bdat 這個檔案來示範如何使用 Python 的 sas7bdat 模組來讀取 SAS7BDAT 格式的 SAS 檔案。

首先以 sas7bdat 模組的 SAS7BDAT 函數開啟 SAS7BDAT 檔案:

# 引入 sas7bdat 模組的 SAS7BDAT 函數
from sas7bdat import SAS7BDAT

# 開啟 SAS7BDAT 格式的 SAS 檔案
reader = SAS7BDAT('airline.sas7bdat')

reader.header 中包含了詳細的標頭資訊:

# 標頭資訊
print(reader.header)
Header:
        col_count_p1: 6
        col_count_p2: 0
        column_count: 6
        compression: None
        creator: Written by SAS
        creator_proc: None
        date_created: 2008-05-13 15:25:11
        date_modified: 2008-05-13 15:25:11
        endianess: little
        file_type: DATA
        filename: airline.sas7bdat
        header_length: 1024
        lcp: 0
        lcs: 16
        mix_page_row_count: 32
        name: AIRLINE
        os_name:
        os_type:
        page_count: 1
        page_length: 4096
        platform: windows
        row_count: 32
        row_length: 44
        sas_release: 9.0000M0
        server_type: WIN
        u64: False

Contents of dataset "AIRLINE":
Num Name    Type   Length Format Label
--- ------- ------ ------ ------ ------------------
  1 YEAR    number      4        year
  2 Y       number      8        level of output
  3 W       number      8        wage rate
  4 R       number      8        interest rate
  5 L       number      8        labor input
  6 K       number      8        capital input

透過 reader.columns 可以取得 SAS 變數的屬性(variable attributes):

# 取得變數屬性(Variable Attributes)
for c in reader.columns:
    print("col_id:", c.col_id)
    print("  name:", c.name.decode(encoding = "utf-8"))
    print("  label:", c.label.decode(encoding = "utf-8"))
    print("  format:", c.format)
    print("  type:", c.type)
    print("  length:", c.length)
col_id: 0
  name: YEAR
  label: year
  format:
  type: number
  length: 4
col_id: 1
  name: Y
  label: level of output
  format:
  type: number
  length: 8
col_id: 2
  name: W
  label: wage rate
  format:
  type: number
  length: 8
col_id: 3
  name: R
  label: interest rate
  format:
  type: number
  length: 8
col_id: 4
  name: L
  label: labor input
  format:
  type: number
  length: 8
col_id: 5
  name: K
  label: capital input
  format:
  type: number
  length: 8

實際的資料內容則可透過 reader 取得:

# 讀取資料
for r in reader:
    print(r)
['YEAR', 'Y', 'W', 'R', 'L', 'K']
[1948.0, 1.2139999866485596, 0.24300000071525574, 0.1454000025987625, 1.4149999618530273, 0.6119999885559082]
[1949.0, 1.3539999723434448, 0.25999999046325684, 0.21809999644756317, 1.3839999437332153, 0.5590000152587891]
[1950.0, 1.569000005722046, 0.27799999713897705, 0.3156999945640564, 1.3880000114440918, 0.5730000138282776]
[1951.0, 1.9479999542236328, 0.296999990940094, 0.39399999380111694, 1.5499999523162842, 0.5640000104904175]
[1952.0, 2.265000104904175, 0.3100000023841858, 0.35589998960494995, 1.8020000457763672, 0.5740000009536743]
[1953.0, 2.7309999465942383, 0.32199999690055847, 0.35929998755455017, 1.9259999990463257, 0.7110000252723694]
[1954.0, 3.0250000953674316, 0.33500000834465027, 0.4025000035762787, 1.9639999866485596, 0.7760000228881836]
[1955.0, 3.562000036239624, 0.3499999940395355, 0.3961000144481659, 2.115999937057495, 0.8270000219345093]
[1956.0, 3.9790000915527344, 0.3610000014305115, 0.3822000026702881, 2.434999942779541, 0.800000011920929]
[1957.0, 4.420000076293945, 0.3790000081062317, 0.304500013589859, 2.7070000171661377, 0.9210000038146973]
[1958.0, 4.563000202178955, 0.39100000262260437, 0.32839998602867126, 2.7060000896453857, 1.0670000314712524]
[1959.0, 5.385000228881836, 0.4259999990463257, 0.3856000006198883, 2.8459999561309814, 1.0829999446868896]
[1960.0, 5.553999900817871, 0.4410000145435333, 0.31929999589920044, 3.0889999866485596, 1.4809999465942383]
[1961.0, 5.465000152587891, 0.46000000834465027, 0.30790001153945923, 3.121999979019165, 1.7359999418258667]
[1962.0, 5.824999809265137, 0.48500001430511475, 0.3783000111579895, 3.184000015258789, 1.9259999990463257]
[1963.0, 6.875999927520752, 0.5059999823570251, 0.4180000126361847, 3.263000011444092, 2.0409998893737793]
[1964.0, 7.822999954223633, 0.5379999876022339, 0.5163000226020813, 3.4119999408721924, 1.996999979019165]
[1965.0, 9.119999885559082, 0.5640000104904175, 0.5878999829292297, 3.622999906539917, 2.256999969482422]
[1966.0, 10.51200008392334, 0.5860000252723694, 0.536899983882904, 4.073999881744385, 2.742000102996826]
[1967.0, 13.020000457763672, 0.621999979019165, 0.44429999589920044, 4.710000038146973, 3.563999891281128]
[1968.0, 15.26099967956543, 0.6660000085830688, 0.3052000105381012, 5.2170000076293945, 4.767000198364258]
[1969.0, 16.312999725341797, 0.7310000061988831, 0.23319999873638153, 5.568999767303467, 6.511000156402588]
[1970.0, 16.00200080871582, 0.8309999704360962, 0.1882999986410141, 5.494999885559082, 7.626999855041504]
[1971.0, 15.87600040435791, 0.906000018119812, 0.20229999721050262, 5.334000110626221, 8.67300033569336]
[1972.0, 16.66200065612793, 1.0, 0.25060001015663147, 5.34499979019165, 8.331000328063965]
[1973.0, 17.013999938964844, 1.055999994277954, 0.2667999863624573, 5.6620001792907715, 8.557000160217285]
[1974.0, 19.30500030517578, 1.13100004196167, 0.266400009393692, 5.729000091552734, 9.508000373840332]
[1975.0, 18.72100067138672, 1.246999979019165, 0.23010000586509705, 5.7220001220703125, 9.062000274658203]
[1976.0, 19.25, 1.375, 0.3452000021934509, 5.76200008392334, 8.26200008392334]
[1977.0, 20.64699935913086, 1.5440000295639038, 0.45080000162124634, 5.876999855041504, 7.473999977111816]
[1978.0, 22.72599983215332, 1.7029999494552612, 0.5877000093460083, 6.107999801635742, 7.104000091552734]
[1979.0, 23.618999481201172, 1.7790000438690186, 0.534600019454956, 6.8520002365112305, 6.874000072479248]

若習慣以 Pandas 來處理資料的話,亦可直接將 SAS7BDAT 檔案的內容轉為 Pandas 的 DataFrame 資料格式:

# 將資料直接轉換為 Pandas DataFrame
df = reader.to_data_frame()
df
      YEAR          Y      W       R      L      K
0   1948.0   1.214000  0.243  0.1454  1.415  0.612
1   1949.0   1.354000  0.260  0.2181  1.384  0.559
2   1950.0   1.569000  0.278  0.3157  1.388  0.573
3   1951.0   1.948000  0.297  0.3940  1.550  0.564
4   1952.0   2.265000  0.310  0.3559  1.802  0.574
5   1953.0   2.731000  0.322  0.3593  1.926  0.711
6   1954.0   3.025000  0.335  0.4025  1.964  0.776
7   1955.0   3.562000  0.350  0.3961  2.116  0.827
8   1956.0   3.979000  0.361  0.3822  2.435  0.800
9   1957.0   4.420000  0.379  0.3045  2.707  0.921
10  1958.0   4.563000  0.391  0.3284  2.706  1.067
11  1959.0   5.385000  0.426  0.3856  2.846  1.083
12  1960.0   5.554000  0.441  0.3193  3.089  1.481
13  1961.0   5.465000  0.460  0.3079  3.122  1.736
14  1962.0   5.825000  0.485  0.3783  3.184  1.926
15  1963.0   6.876000  0.506  0.4180  3.263  2.041
16  1964.0   7.823000  0.538  0.5163  3.412  1.997
17  1965.0   9.120000  0.564  0.5879  3.623  2.257
18  1966.0  10.512000  0.586  0.5369  4.074  2.742
19  1967.0  13.020000  0.622  0.4443  4.710  3.564
20  1968.0  15.261000  0.666  0.3052  5.217  4.767
21  1969.0  16.313000  0.731  0.2332  5.569  6.511
22  1970.0  16.002001  0.831  0.1883  5.495  7.627
23  1971.0  15.876000  0.906  0.2023  5.334  8.673
24  1972.0  16.662001  1.000  0.2506  5.345  8.331
25  1973.0  17.014000  1.056  0.2668  5.662  8.557
26  1974.0  19.305000  1.131  0.2664  5.729  9.508
27  1975.0  18.721001  1.247  0.2301  5.722  9.062
28  1976.0  19.250000  1.375  0.3452  5.762  8.262
29  1977.0  20.646999  1.544  0.4508  5.877  7.474
30  1978.0  22.726000  1.703  0.5877  6.108  7.104
31  1979.0  23.618999  1.779  0.5346  6.852  6.874

若要將資料匯出為逗點分隔(CSV)檔案,可以使用 convert_file 函數,而其 delimiter 可以指定欄位分隔字元(預設為逗號):

# 將資料匯出為 CSV 檔案
reader.convert_file("output.csv", delimiter=',')

檔案使用完畢後,呼叫 close 關閉檔案:

# 關閉檔案
reader.close()

sas7bdat_to_csv 指令稿工具

在安裝 Python 的 sas7bdat 模組時,會附帶安裝一個 sas7bdat_to_csv 指令稿工具,它可以用來將 SAS7BDAT 格式的 SAS 檔案轉換為逗點分隔(CSV)檔案:

# 將 SAS7BDAT 檔案轉換為 CSV 檔案
sas7bdat_to_csv airline.sas7bdat
[airline.sas7bdat] ⟶ [airline.csv] wrote 32 of 32 lines

Successfully converted 1 of 1 file

新產生的 CSV 檔案會儲存在 airline.csv。

分類:Python

主要資訊欄

搜尋

近期文章

  • Python 使用 PyAutoGUI 自動操作滑鼠與鍵盤
  • Ubuntu Linux 以 WireGuard 架設 VPN 伺服器教學與範例
  • Linux 網路設定 ip 指令用法教學與範例
  • Windows 使用 TPM 虛擬智慧卡保護 SSH 金鑰教學與範例
  • Linux 以 Shamir’s Secret Sharing 分割保存金鑰教學與範例
  • Linux 以 Cryptsetup、LUKS 加密 USB 隨身碟教學與範例
  • Linux 以 Cryptsetup 與 LUKS 加密磁碟教學與範例
  • Linux 使用 age 簡潔的加密、解密工具使用教學與範例

推薦網站

  • Udemy 線上教學課程
  • Coursera 線上教學課程

關注本站

  • 電子郵件
  • Facebook

公益

  • 家扶基金會
  • 台灣世界展望會
  • Yahoo 奇摩公益
  • igiving 公益網
  • 兒福聯盟

Copyright © 2021 · Office Guide