How do you import and export bookmarks in JS? Here is how I did it

Last updated: 9 months ago

Contents

Preface

Dependencies

Overview

Implementation

FileSystem:

HTMLSystem:

html-config:

Closing notes


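Preface

Anyone who has written a crawler in Node has probably come across the Cheerio.js module. It is fast and flexible, and because its API mirrors jQuery's, anyone who knows jQuery can pick it up quickly. It is a staple of scraping web pages with Node.

After getting to know cheerio, I had an idea: why not use it to implement bookmark import, which would also be a good way to get familiar with its API? So a while back I built a first version with cheerio and Node. The front-end page uploaded the bookmark file exported from the browser to the server, the server used cheerio to parse the HTML into JSON, and an endpoint returned the data to the front end.

I was not satisfied with that, though. Running a Node service just for one endpoint seemed like overkill. Could I build a purely front-end bookmark preview with import and export, relying only on local caching?

So I got to work. For import, the front end reads the HTML file with FileReader, then cheerio parses the DOM into JSON, which the page renders as a menu. For export, cheerio builds the corresponding DOM from the JSON data, URL.createObjectURL creates a local URL for the generated file, and an a tag triggers the download.

Below I share the full implementation and source code.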

Dependencies

  • utils-lib-js

  • cheerio

  • vite: 3.1

  • vue: 3.2

  • element-plus: 2.0

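Overview

This small demo is a Vue 3 project scaffolded with Vite. Apart from the layout, its core consists of two classes: FileSystem and HTMLSystem. The former handles downloading and reading files; the latter converts between JSON and HTML. Everything else is ordinary layout and components, so this article focuses on these two classes.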
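
The Folder and File types used throughout are imported from "@/layout/menu/types", which this article does not show. As a rough guide, here is a sketch of what they might look like, inferred only from the fields the code below reads and writes. The names BookmarkFile, BookmarkFolder, isFolder and countFiles are made up for illustration.

```typescript
// Hypothetical shapes, inferred from the fields the article's code reads and writes.
// The real definitions live in "@/layout/menu/types" and may differ.
interface BookmarkFile {
    id: string;
    name: string;
    href: string;
    icon: string;
    add_date?: string;
}

interface BookmarkFolder {
    id: string;
    title: string;
    add_date?: string;
    last_modified?: string;
    children: Array<BookmarkFolder | BookmarkFile>;
}

type BookmarkItem = BookmarkFolder | BookmarkFile;

// Type guard: in this sketch a folder is any item that carries a children array.
const isFolder = (item: BookmarkItem): item is BookmarkFolder => "children" in item;

// Count the bookmarks (files) in a tree, descending into folders recursively.
function countFiles(items: BookmarkItem[]): number {
    return items.reduce(
        (total, item) => total + (isFolder(item) ? countFiles(item.children) : 1),
        0
    );
}

const sample: BookmarkItem[] = [
    {
        id: "0",
        title: "Bookmarks bar",
        add_date: "1660000000",
        last_modified: "1660000100",
        children: [
            { id: "1", name: "Example", href: "https://example.com/", icon: "", add_date: "1660000050" },
            { id: "2", title: "Empty folder", children: [] },
        ],
    },
];

console.log(countFiles(sample)); // 1
```

The ids follow traversal order, which matches how the counter in HTMLSystem hands them out.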

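Implementation

FileSystem:

  • Read a file: take the file handed over by Element Plus's el-upload component and read its contents as a string
  • Download a file: download a static resource from a given URL
  • Turn local file content into a URL (a Blob URL)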
import type { UploadFile } from "element-plus/es/components/upload/src/upload.type";
import { defer } from "utils-lib-js";
export type readFileType = 'readAsArrayBuffer' | 'readAsBinaryString' | 'readAsDataURL' | 'readAsText'
export declare interface IFileSystem {
    readFile: (file: UploadFile, type?: readFileType, encoding?: string) => Promise<ProgressEvent<FileReader>>
    downloadFile: (url: string, name?: string) => void
    stringToBlobURL: (fileString: string) => string
}
export class FileSystem implements IFileSystem {
    /**
     * @description: Read a file uploaded in the browser
     * @param {UploadFile} file the file from el-upload
     * @param {readFileType} type the FileReader method to read with
     * @param {string} encoding text encoding (only used by readAsText)
     * @return {Promise<ProgressEvent<FileReader>>}
     */
    readFile(file: UploadFile, type: readFileType = 'readAsText', encoding: string = 'utf-8') {
        const { promise, resolve, reject } = defer()
        const reader: FileReader = new FileReader();
        // Attach the handlers before starting the read
        reader.onload = resolve
        reader.onerror = reject
        if (!file.raw) {
            // el-upload may hand over an entry without the underlying File
            reject(new Error('No raw file to read'))
        } else if (type === 'readAsText') {
            reader.readAsText(file.raw, encoding)
        } else {
            // The other read methods take no encoding argument
            reader[type](file.raw)
        }
        return <Promise<any>>promise
    }
    /**
     * @description: Download a file
     * @param {string} url resource path or URL
     * @param {string} name file name for the download
     * @return {void}
     */
    downloadFile(url: string, name: string = 'file.txt') {
        const link = document.createElement('a')
        link.href = url
        link.download = name
        // Simulate a click on the link to trigger the download
        const _evt = new MouseEvent('click')
        link.dispatchEvent(_evt)
    }
    /**
     * @description: Turn a string into a local Blob URL
     * @param {string} fileString file contents
     * @return {string}
     */
    stringToBlobURL(fileString: string) {
        return URL.createObjectURL(new Blob([fileString], { type: "application/octet-stream" }))
    }
}
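
One thing stringToBlobURL leaves open is cleanup: a Blob URL stays alive until it is revoked or the page is unloaded. The sketch below shows one possible way to pair creation with release. createBlobURL and BlobURLHandle are hypothetical names, not part of the FileSystem class, and the code assumes a runtime that provides global Blob and URL.createObjectURL.

```typescript
// Assumes a runtime with global Blob and URL.createObjectURL
// (any modern browser; recent Node.js versions provide both as well).
interface BlobURLHandle {
    url: string;
    revoke: () => void;
}

// createBlobURL is a hypothetical helper, not part of the article's FileSystem class.
function createBlobURL(content: string, type = "application/octet-stream"): BlobURLHandle {
    const url = URL.createObjectURL(new Blob([content], { type }));
    return {
        url,
        // Release the memory held for the Blob once the URL is no longer needed.
        revoke: () => URL.revokeObjectURL(url),
    };
}

const handle = createBlobURL("<DL><p>\n</DL><p>");
console.log(handle.url.startsWith("blob:")); // true
handle.revoke();
```

In the browser, handle.url would be passed to downloadFile, with revoke called a little later (for example from a setTimeout) so the download has time to start.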

HTMLSystem:

  • HTML to JSON: walk the DOM tree and produce JSON data
  • JSON to HTML: generate HTML tags in the bookmark file format from the normalized JSON

import { load, Cheerio, CheerioAPI, CheerioOptions } from 'cheerio'
import {
    createHtmlFolder,
    createHtmlFile,
    createBaseTemp
} from '@/config'
import { File, Folder } from "@/layout/menu/types";
// The signatures below mirror what the HTMLSystem class actually implements
export declare interface IHTMLSystem<F = Folder | File, T = Cheerio<any>, I = CheerioAPI, FolderList = Array<F>> {
    count: number
    resetCount: () => void
    addCount: () => number
    initHTML: (html: string) => FolderList
    htmlToJson: (node: T, bookMarks?: FolderList) => FolderList
    addToBookMarks: (node: T, list?: FolderList) => { item: F, list: FolderList, dirType: 'folder' | 'file' | 'none' }
    getNodeTitle: (node: T) => F
    getNodeInfo: (node: T, info: File) => File
    createInitHtml: (temp: string, opt?: CheerioOptions, isDoc?: boolean) => I
    initJSON: (json: FolderList) => string
    jsonToHtml: (bookMarks?: FolderList) => string
    createFolder: (folder: Folder) => I
    createFile: (file: File) => I
    createElemChild: (node: T) => (it: F) => void
    checkIsFileOrFolder: (item: F) => 'folder' | 'file' | 'none'
}
export class HTMLSystem implements IHTMLSystem {
    count = 0
    /**
     * @description: Reset the id counter
     * @return {void}
     */
    resetCount = () => {
        this.count = 0;
    };
    /**
     * @description: Return the current id, then increment the counter
     * @return {number}
     */
    addCount = () => {
        return this.count++
    };
    /**
     * @description: Entry point for HTML to JSON
     * @param {string} html contents of the exported bookmark HTML file
     * @return {Array<Folder | File>}
     */
    initHTML(html: string) {
        const $ = load(html);
        const dl = $("dl").first();
        // Only the first top-level <DT> is converted
        const dt = dl.children("dt").eq(0);
        return this.htmlToJson(dt, []);
    }

    /**
     * @description: Recursively convert HTML to JSON
     * @param {Cheerio} node the <DT> node to convert
     * @param {Array} bookMarks list that receives the converted item
     * @return {Array<Folder | File>}
     */
    htmlToJson = (node: Cheerio<any>, bookMarks: Array<Folder | File> = []) => {
        // <DT> items of the nested <DL>, i.e. the next folder level
        const childrenNodeDL = node.children("dl");
        const childrenNodeDT = childrenNodeDL.children("dt");
        const { item: dir } = this.addToBookMarks(node, bookMarks)
        // A bookmark (file) has no nested <DL>, so this loop only runs for folders
        childrenNodeDT.each((i) => {
            this.htmlToJson(childrenNodeDT.eq(i), dir.children);
        });
        return bookMarks;
    };
    /**
     * @description: Parse a single node and add it to the JSON list
     * @param {Cheerio} node the <DT> node to parse
     * @param {Array} list bookmark JSON list to append to
     * @return {{ item: Folder | File, list: Array<Folder | File>, dirType: 'folder' | 'file' | 'none' }}
     */
    addToBookMarks = (node: Cheerio<any>, list: Array<Folder | File> = []) => {
        const item = this.getNodeTitle(node);
        const dirType = this.checkIsFileOrFolder(item)
        switch (dirType) {
            case "folder":
                item.children = [];
            // Intentional fall-through: folders also get an id and are pushed
            case "file":
                item.id = this.addCount().toString()
                list.push(item)
                break;
        }
        return { item, list, dirType }
    }
    /**
     * @description: Work out whether a node is a folder and parse its details
     * @param {Cheerio} node the node holding a file or a folder
     * @return {Folder | File}
     */
    getNodeTitle = (node: Cheerio<any>) => {
        const info: any = {};

        const title = node.children("h3");
        // No h3 means the node is not a folder, so read the site name and URL;
        // otherwise it is a folder: set title, add_date and last_modified
        return title.length === 0 ? this.getNodeInfo(node, info) : {
            ...info,
            title: title.text(),
            add_date: title.attr("add_date"),
            last_modified: title.attr("last_modified")
        };
    };
    /**
     * @description: Parse the details of a bookmark (file)
     * @param {Cheerio} node the node holding the file
     * @param {File} info base object to extend
     * @return {File}
     */
    getNodeInfo = (node: Cheerio<any>, info: File) => ({
        ...info,
        name: node.children("a").text(),
        href: node.children("a").attr("href") ?? '',
        icon: node.children("a").attr("icon") ?? '',
        add_date: node.children("a").attr("add_date")
    })
    /**
     * @description: Entry point for JSON to HTML
     * @param {Array} json bookmark JSON data, as produced by initHTML
     * @return {string}
     */
    initJSON(json: Array<Folder | File>) {
        return this.jsonToHtml(json);
    }
    /**
     * @description: Load a template and return its CheerioAPI
     * @param {string} temp markup to load
     * @param {*} opt Cheerio options (XML mode by default)
     * @param {*} isDoc whether to treat the markup as a full HTML document
     * @return {CheerioAPI}
     */
    createInitHtml = (temp: string, opt = { xml: true, xmlMode: true }, isDoc = false) => {
        const $ = load(temp, opt, isDoc);
        return $
    }
    /**
     * @description: Main function for converting JSON to bookmark HTML
     * @param {Array} bookMarks bookmark JSON data
     * @return {string}
     */
    jsonToHtml = (bookMarks: Array<Folder | File> = []) => {
        // Wrap the base <DL> template in a root element so it can be selected
        const root = this.createInitHtml(`<div id="root">${createBaseTemp()}</div>`)("#root")
        bookMarks.forEach(this.createElemChild(root.children().first()))
        return root.children().toString()
    }
    /**
     * @description: Recursively build the DOM tree
     * @param {Cheerio} node parent node
     * @return {Function} callback that appends one item to the parent node
     */
    createElemChild = (node: Cheerio<any>) => (it: Folder | File) => {
        const type = this.checkIsFileOrFolder(it)
        switch (type) {
            case 'folder':
                const folder = this.createFolder(it as Folder)
                node.append(folder("*"))
                // Always take the last <DL>, which is the one just appended,
                // and put the children there so earlier folders are not revisited
                (it as Folder).children.forEach(this.createElemChild(node.children("DL").last()))
                break
            case 'file':
                const file = this.createFile(it as File)
                node.append(file('*'))
                break
            case 'none':
                throw new Error('Item is not Folder or File')
        }
    }
    /**
     * @description: Build the tags for a folder
     * @param {Folder} folder a single folder item
     * @return {CheerioAPI}
     */
    createFolder = (folder: Folder) => {
        const init = this.createInitHtml(createHtmlFolder(folder))
        return init
    }
    /**
     * @description: Build the tags for a bookmark (file)
     * @param {File} file a single file item
     * @return {CheerioAPI}
     */
    createFile = (file: File) => {
        const init = this.createInitHtml(createHtmlFile(file))
        return init
    }
    /**
     * @description: Tell whether an item is a file or a folder
     * @param {Folder | File} item a single item
     * @return {'folder' | 'file' | 'none'}
     */
    checkIsFileOrFolder = (item: Folder | File) => item.title ? 'folder' : item.name ? 'file' : 'none'

}
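
To make the mapping from nested <DL>/<DT> tags to nested JSON easier to follow, here is a dependency-free sketch of the same idea. It does not use cheerio: it swaps in a regex tokenizer and a stack, and it is only an illustration, not the implementation above. The names parseBookmarks, attr, FileItem and FolderItem are invented for the example. Unlike initHTML, it converts every top-level entry rather than only the first one, and it does not decode HTML entities.

```typescript
// Hypothetical item shapes; the article's real types are not shown.
interface FileItem { id: string; name: string; href: string; add_date: string | undefined }
interface FolderItem { id: string; title: string; add_date: string | undefined; children: Item[] }
type Item = FolderItem | FileItem;

// Read one attribute value (case-insensitive name) from a raw attribute string.
function attr(attrs: string, name: string): string | undefined {
    const m = new RegExp(`\\b${name}="([^"]*)"`, "i").exec(attrs);
    return m ? m[1] : undefined;
}

// Swapped-in technique: a regex tokenizer plus a stack instead of a DOM parser.
function parseBookmarks(html: string): Item[] {
    const root: Item[] = [];
    const stack: Item[][] = [root];        // the top of the stack is the list being filled
    let pending: FolderItem | null = null; // folder whose <DL> has not opened yet
    let count = 0;
    const token = /<H3\b([^>]*)>([\s\S]*?)<\/H3>|<A\b([^>]*)>([\s\S]*?)<\/A>|<DL\b[^>]*>|<\/DL>/gi;
    let m: RegExpExecArray | null;
    while ((m = token.exec(html)) !== null) {
        const current = stack[stack.length - 1];
        const tag = m[0].toUpperCase();
        if (tag.startsWith("<H3")) {
            pending = { id: String(count++), title: m[2], add_date: attr(m[1], "ADD_DATE"), children: [] };
            current.push(pending);
        } else if (tag.startsWith("<A")) {
            current.push({ id: String(count++), name: m[4], href: attr(m[3], "HREF") ?? "", add_date: attr(m[3], "ADD_DATE") });
        } else if (tag.startsWith("<DL")) {
            // The <DL> after an <H3> holds that folder's children; the outermost <DL> maps to root.
            stack.push(pending ? pending.children : current);
            pending = null;
        } else if (stack.length > 1) {
            stack.pop(); // </DL> closes the current level
        }
    }
    return root;
}

const sampleHtml = `<DL><p>
    <DT><H3 ADD_DATE="1">Bar</H3>
    <DL><p>
        <DT><A HREF="https://example.com/" ADD_DATE="2">Example</A>
        <DT><H3 ADD_DATE="3">Sub</H3>
        <DL><p>
            <DT><A HREF="https://example.org/">Org</A>
        </DL><p>
    </DL><p>
    <DT><A HREF="https://example.net/">Net</A>
</DL><p>`;

const tree = parseBookmarks(sampleHtml);
console.log(JSON.stringify(tree, null, 2));
```

A real DOM parser such as cheerio copes with malformed markup far better than this; the sketch only shows how each <DL> opens a new children list and each </DL> returns to the parent.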

html-config:

In addition, generating the HTML requires a few template functions.


import { File, Folder } from "@/layout/menu/types";
/**
 * @description: Default header of a bookmark file
 * @param {string} name bookmark file title
 * @return {string}
 */
export const createHtmlTemp = (name: string) => `<!DOCTYPE NETSCAPE-Bookmark-file-1>
<!-- This is an automatically generated file.
     It will be read and overwritten.
     DO NOT EDIT! -->
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=UTF-8">
<TITLE>${name}</TITLE>
<H1>${name}</H1>
`
/**
 * @description: Build the markup for a folder
 * @param {Folder} folder folder data
 * @return {string}
 */
export const createHtmlFolder = (folder: Folder) => `
<DT/>
<H3 ADD_DATE="${folder.add_date}" LAST_MODIFIED="${folder.last_modified}">${folder.title}</H3>
${createBaseTemp()}
`
/**
 * @description: Build the markup for a bookmark (file)
 * @param {File} file file data
 * @return {string}
 */
export const createHtmlFile = (file: File) => `
<DT/>
<A HREF="${file.href}" ICON="${file.icon}" ADD_DATE="${file.add_date}">${file.name}</A>
`
/**
 * @description: Markup for an empty list
 * @return {string}
 */
export const createBaseTemp = () => `
<DL><p>
</DL><p>
`
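
The templates above interpolate values directly, so a title containing a quote or an angle bracket would end up in the markup unescaped, and a missing add_date is written out as the literal text "undefined". The following sketch shows one way this could be handled with plain string building. escapeHtml, attribute and renderList are hypothetical helpers written for this example, not part of the project's config, and the item shapes are assumptions.

```typescript
// Hypothetical item shapes; the article's real types are not shown.
interface FileItem { name: string; href: string; icon?: string; add_date?: string }
interface FolderItem { title: string; add_date?: string; last_modified?: string; children: Item[] }
type Item = FolderItem | FileItem;

// Escape the characters that would otherwise end an attribute value or start a tag.
function escapeHtml(value: string): string {
    return value
        .replace(/&/g, "&amp;")
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;")
        .replace(/"/g, "&quot;");
}

// Build ` NAME="value"` only when a value is present, so ADD_DATE="undefined" is never emitted.
function attribute(name: string, value?: string): string {
    return value === undefined || value === "" ? "" : ` ${name}="${escapeHtml(value)}"`;
}

// Render a list of items as a Netscape-style <DL> block, recursing into folders.
function renderList(items: Item[], depth = 0): string {
    const pad = "    ".repeat(depth);
    const inner = items.map((item) => {
        if ("children" in item) {
            const head = `${pad}    <DT><H3${attribute("ADD_DATE", item.add_date)}${attribute("LAST_MODIFIED", item.last_modified)}>${escapeHtml(item.title)}</H3>`;
            return `${head}\n${renderList(item.children, depth + 1)}`;
        }
        return `${pad}    <DT><A${attribute("HREF", item.href)}${attribute("ICON", item.icon)}${attribute("ADD_DATE", item.add_date)}>${escapeHtml(item.name)}</A>`;
    });
    return [`${pad}<DL><p>`, ...inner, `${pad}</DL><p>`].join("\n");
}

const rendered = renderList([
    {
        title: "Dev <tools>",
        add_date: "1660000000",
        children: [{ name: 'A & "B"', href: "https://example.com/?a=1&b=2" }],
    },
]);
console.log(rendered);
```

Prepending the output of createHtmlTemp to this string would give a complete bookmark file.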

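Closing notes

Final result: BookMarks

Source code: book_mark: a purely front-end tool that imports and exports HTML bookmarks and generates a bookmark navigation page

Finally, thanks for reading this far. If this article helped you, please consider supporting the author!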