• 跳至主要導覽
  • 跳至主要內容
  • 跳至主要資訊欄
Office 指南

Office 指南

辦公室工作實用教學

  • Excel
  • Word
  • PowerPoint
  • Windows
  • PowerShell
  • R

Python 讀取 SAS7BDAT 格式 SAS 檔案教學與範例

介紹如何在 Python 中使用 sas7bdat 模組讀取 SAS7BDAT 格式的 SAS 檔案。

相關文章:
R 讀取 SAS7BDAT 格式 SAS 檔案教學與範例

安裝 sas7bdat 模組

Python 的 sas7bdat 模組可以透過 pip 指令來安裝:

# 安裝 sas7bdat 模組
pip install sas7bdat

讀取 SAS7BDAT 格式 SAS 檔案

這裡我們以 airline.sas7bdat 這個檔案來示範如何使用 Python 的 sas7bdat 模組來讀取 SAS7BDAT 格式的 SAS 檔案。

首先以 sas7bdat 模組的 SAS7BDAT 函數開啟 SAS7BDAT 檔案:

# 引入 sas7bdat 模組的 SAS7BDAT 函數
from sas7bdat import SAS7BDAT

# 開啟 SAS7BDAT 格式的 SAS 檔案
reader = SAS7BDAT('airline.sas7bdat')

reader.header 中包含了詳細的標頭資訊:

# 標頭資訊
print(reader.header)
Header:
        col_count_p1: 6
        col_count_p2: 0
        column_count: 6
        compression: None
        creator: Written by SAS
        creator_proc: None
        date_created: 2008-05-13 15:25:11
        date_modified: 2008-05-13 15:25:11
        endianess: little
        file_type: DATA
        filename: airline.sas7bdat
        header_length: 1024
        lcp: 0
        lcs: 16
        mix_page_row_count: 32
        name: AIRLINE
        os_name:
        os_type:
        page_count: 1
        page_length: 4096
        platform: windows
        row_count: 32
        row_length: 44
        sas_release: 9.0000M0
        server_type: WIN
        u64: False

Contents of dataset "AIRLINE":
Num Name    Type   Length Format Label
--- ------- ------ ------ ------ ------------------
  1 YEAR    number      4        year
  2 Y       number      8        level of output
  3 W       number      8        wage rate
  4 R       number      8        interest rate
  5 L       number      8        labor input
  6 K       number      8        capital input

透過 reader.columns 可以取得 SAS 變數的屬性(variable attributes):

# 取得變數屬性(Variable Attributes)
for c in reader.columns:
    print("col_id:", c.col_id)
    print("  name:", c.name.decode(encoding = "utf-8"))
    print("  label:", c.label.decode(encoding = "utf-8"))
    print("  format:", c.format)
    print("  type:", c.type)
    print("  length:", c.length)
col_id: 0
  name: YEAR
  label: year
  format:
  type: number
  length: 4
col_id: 1
  name: Y
  label: level of output
  format:
  type: number
  length: 8
col_id: 2
  name: W
  label: wage rate
  format:
  type: number
  length: 8
col_id: 3
  name: R
  label: interest rate
  format:
  type: number
  length: 8
col_id: 4
  name: L
  label: labor input
  format:
  type: number
  length: 8
col_id: 5
  name: K
  label: capital input
  format:
  type: number
  length: 8

實際的資料內容則可透過 reader 取得:

# 讀取資料
for r in reader:
    print(r)
['YEAR', 'Y', 'W', 'R', 'L', 'K']
[1948.0, 1.2139999866485596, 0.24300000071525574, 0.1454000025987625, 1.4149999618530273, 0.6119999885559082]
[1949.0, 1.3539999723434448, 0.25999999046325684, 0.21809999644756317, 1.3839999437332153, 0.5590000152587891]
[1950.0, 1.569000005722046, 0.27799999713897705, 0.3156999945640564, 1.3880000114440918, 0.5730000138282776]
[1951.0, 1.9479999542236328, 0.296999990940094, 0.39399999380111694, 1.5499999523162842, 0.5640000104904175]
[1952.0, 2.265000104904175, 0.3100000023841858, 0.35589998960494995, 1.8020000457763672, 0.5740000009536743]
[1953.0, 2.7309999465942383, 0.32199999690055847, 0.35929998755455017, 1.9259999990463257, 0.7110000252723694]
[1954.0, 3.0250000953674316, 0.33500000834465027, 0.4025000035762787, 1.9639999866485596, 0.7760000228881836]
[1955.0, 3.562000036239624, 0.3499999940395355, 0.3961000144481659, 2.115999937057495, 0.8270000219345093]
[1956.0, 3.9790000915527344, 0.3610000014305115, 0.3822000026702881, 2.434999942779541, 0.800000011920929]
[1957.0, 4.420000076293945, 0.3790000081062317, 0.304500013589859, 2.7070000171661377, 0.9210000038146973]
[1958.0, 4.563000202178955, 0.39100000262260437, 0.32839998602867126, 2.7060000896453857, 1.0670000314712524]
[1959.0, 5.385000228881836, 0.4259999990463257, 0.3856000006198883, 2.8459999561309814, 1.0829999446868896]
[1960.0, 5.553999900817871, 0.4410000145435333, 0.31929999589920044, 3.0889999866485596, 1.4809999465942383]
[1961.0, 5.465000152587891, 0.46000000834465027, 0.30790001153945923, 3.121999979019165, 1.7359999418258667]
[1962.0, 5.824999809265137, 0.48500001430511475, 0.3783000111579895, 3.184000015258789, 1.9259999990463257]
[1963.0, 6.875999927520752, 0.5059999823570251, 0.4180000126361847, 3.263000011444092, 2.0409998893737793]
[1964.0, 7.822999954223633, 0.5379999876022339, 0.5163000226020813, 3.4119999408721924, 1.996999979019165]
[1965.0, 9.119999885559082, 0.5640000104904175, 0.5878999829292297, 3.622999906539917, 2.256999969482422]
[1966.0, 10.51200008392334, 0.5860000252723694, 0.536899983882904, 4.073999881744385, 2.742000102996826]
[1967.0, 13.020000457763672, 0.621999979019165, 0.44429999589920044, 4.710000038146973, 3.563999891281128]
[1968.0, 15.26099967956543, 0.6660000085830688, 0.3052000105381012, 5.2170000076293945, 4.767000198364258]
[1969.0, 16.312999725341797, 0.7310000061988831, 0.23319999873638153, 5.568999767303467, 6.511000156402588]
[1970.0, 16.00200080871582, 0.8309999704360962, 0.1882999986410141, 5.494999885559082, 7.626999855041504]
[1971.0, 15.87600040435791, 0.906000018119812, 0.20229999721050262, 5.334000110626221, 8.67300033569336]
[1972.0, 16.66200065612793, 1.0, 0.25060001015663147, 5.34499979019165, 8.331000328063965]
[1973.0, 17.013999938964844, 1.055999994277954, 0.2667999863624573, 5.6620001792907715, 8.557000160217285]
[1974.0, 19.30500030517578, 1.13100004196167, 0.266400009393692, 5.729000091552734, 9.508000373840332]
[1975.0, 18.72100067138672, 1.246999979019165, 0.23010000586509705, 5.7220001220703125, 9.062000274658203]
[1976.0, 19.25, 1.375, 0.3452000021934509, 5.76200008392334, 8.26200008392334]
[1977.0, 20.64699935913086, 1.5440000295639038, 0.45080000162124634, 5.876999855041504, 7.473999977111816]
[1978.0, 22.72599983215332, 1.7029999494552612, 0.5877000093460083, 6.107999801635742, 7.104000091552734]
[1979.0, 23.618999481201172, 1.7790000438690186, 0.534600019454956, 6.8520002365112305, 6.874000072479248]

若習慣以 Pandas 來處理資料的話,亦可直接將 SAS7BDAT 檔案的內容轉為 Pandas 的 DataFrame 資料格式:

# 將資料直接轉換為 Pandas DataFrame
df = reader.to_data_frame()
df
      YEAR          Y      W       R      L      K
0   1948.0   1.214000  0.243  0.1454  1.415  0.612
1   1949.0   1.354000  0.260  0.2181  1.384  0.559
2   1950.0   1.569000  0.278  0.3157  1.388  0.573
3   1951.0   1.948000  0.297  0.3940  1.550  0.564
4   1952.0   2.265000  0.310  0.3559  1.802  0.574
5   1953.0   2.731000  0.322  0.3593  1.926  0.711
6   1954.0   3.025000  0.335  0.4025  1.964  0.776
7   1955.0   3.562000  0.350  0.3961  2.116  0.827
8   1956.0   3.979000  0.361  0.3822  2.435  0.800
9   1957.0   4.420000  0.379  0.3045  2.707  0.921
10  1958.0   4.563000  0.391  0.3284  2.706  1.067
11  1959.0   5.385000  0.426  0.3856  2.846  1.083
12  1960.0   5.554000  0.441  0.3193  3.089  1.481
13  1961.0   5.465000  0.460  0.3079  3.122  1.736
14  1962.0   5.825000  0.485  0.3783  3.184  1.926
15  1963.0   6.876000  0.506  0.4180  3.263  2.041
16  1964.0   7.823000  0.538  0.5163  3.412  1.997
17  1965.0   9.120000  0.564  0.5879  3.623  2.257
18  1966.0  10.512000  0.586  0.5369  4.074  2.742
19  1967.0  13.020000  0.622  0.4443  4.710  3.564
20  1968.0  15.261000  0.666  0.3052  5.217  4.767
21  1969.0  16.313000  0.731  0.2332  5.569  6.511
22  1970.0  16.002001  0.831  0.1883  5.495  7.627
23  1971.0  15.876000  0.906  0.2023  5.334  8.673
24  1972.0  16.662001  1.000  0.2506  5.345  8.331
25  1973.0  17.014000  1.056  0.2668  5.662  8.557
26  1974.0  19.305000  1.131  0.2664  5.729  9.508
27  1975.0  18.721001  1.247  0.2301  5.722  9.062
28  1976.0  19.250000  1.375  0.3452  5.762  8.262
29  1977.0  20.646999  1.544  0.4508  5.877  7.474
30  1978.0  22.726000  1.703  0.5877  6.108  7.104
31  1979.0  23.618999  1.779  0.5346  6.852  6.874

若要將資料匯出為逗點分隔(CSV)檔案,可以使用 convert_file 函數,而其 delimiter 可以指定欄位分隔字元(預設為逗號):

# 將資料匯出為 CSV 檔案
reader.convert_file("output.csv", delimiter=',')

檔案使用完畢後,呼叫 close 關閉檔案:

# 關閉檔案
reader.close()

sas7bdat_to_csv 指令稿工具

在安裝 Python 的 sas7bdat 模組時,會附帶安裝一個 sas7bdat_to_csv 指令稿工具,它可以用來將 SAS7BDAT 格式的 SAS 檔案轉換為逗點分隔(CSV)檔案:

# 將 SAS7BDAT 檔案轉換為 CSV 檔案
sas7bdat_to_csv airline.sas7bdat
[airline.sas7bdat] ⟶ [airline.csv] wrote 32 of 32 lines

Successfully converted 1 of 1 file

新產生的 CSV 檔案會儲存在 airline.csv。

分類:Python

讀者互動方式

發佈留言 取消回覆

發佈留言必須填寫的電子郵件地址不會公開。 必填欄位標示為 *

主要資訊欄

搜尋

近期文章

  • hashcat 密碼雜湊快速破解工具使用教學與範例
  • macOS 使用 GPGTools 與 GPG Mail 對郵件加密、解密、簽章、驗證教學與範例
  • wrk:HTTP 網頁伺服器效能測試工具使用教學
  • C++ 語言使用 Crypto++ 實作 RSA 數位簽章教學與範例
  • C++ 語言使用 Crypto++ 實作 RSA-OAEP 搭配 SHA256 加密教學與範例
  • C++ 語言使用 Crypto++ 實作 AES 加密、解密、認證加密教學與範例
  • C++ 語言使用 Crypto++ 實作 MD5、SHA1、SHA2、BLAKE2 雜湊教學與範例
  • Ubuntu Linux 安裝、使用 Crypto++ 加密函式庫教學與範例

推薦網站

  • Udemy 線上教學課程
  • Coursera 線上教學課程

關注本站

  • 電子郵件
  • Facebook

公益

  • 家扶基金會
  • 台灣世界展望會
  • Yahoo 奇摩公益
  • igiving 公益網
  • 兒福聯盟

Copyright © 2021 · Office Guide