2017-09-19 46 views
6

Why does the Linux kernel have `struct sock` and `struct socket`?

This question has been asked before, but I could not find a good answer.

The Linux kernel network stack has two structures: `struct socket` and `struct sock`.

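The two structures are essentially connected, but they seem to have slightly different lifetimes. Given a `struct socket *sock`, you can find the `struct sock` via `sock->sk`; given a `struct sock *sk`, you can get back to the `struct socket` via `sk->sk_socket`.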
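To make the two cross-pointers concrete, here is a minimal user-space sketch. The structures are simplified stand-ins that keep only the `sk` and `sk_socket` members named above; `link_pair` and `unlink_pair` are hypothetical helpers, not kernel functions, and the detach step is only an assumption about what "different lifetimes" can look like.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins, NOT the kernel definitions: only the two
 * cross-pointers discussed in the question are modelled. */
struct sock;

struct socket {
    struct sock *sk;            /* socket -> sock */
};

struct sock {
    struct socket *sk_socket;   /* sock -> socket */
};

/* Hypothetical helper: attach the protocol-side object to the
 * generic socket so each side can reach the other. */
static void link_pair(struct socket *sock, struct sock *sk)
{
    sock->sk = sk;
    sk->sk_socket = sock;
}

/* Hypothetical helper: break the link. After this, either object can
 * outlive the other, which is one way the lifetimes may differ. */
static void unlink_pair(struct socket *sock, struct sock *sk)
{
    assert(sock->sk == sk && sk->sk_socket == sock);
    sock->sk = NULL;
    sk->sk_socket = NULL;
}
```

After `link_pair(&s, &k)`, the round trip `s.sk->sk_socket` leads back to `&s`, which is the navigation described in the question.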

Why are there two structures for storing socket information? Assuming I need to add a new field, when should I add it to `struct socket` and when to `struct sock`?

UPDATE: Note that I am referring to `struct socket` in include/linux/net.h inside the Linux source code, which is meant for kernel code only, and not to /usr/include/sys/socket.h, which is meant for user space.

+0

Well: there is an inside and an outside. (A similar situation exists, for example, with `struct stat`.) – wildplasser

Answers

4

`struct socket` seems to be the higher-level interface, which is used for system calls (that is why it also has a pointer to `struct file`, which represents the file descriptor here).
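From user space, the visible consequence of the `struct file` link is that a socket is handled as an ordinary file descriptor. The following is a minimal sketch assuming a POSIX system; `fd_is_socket` and `fd_is_socket_demo` are hypothetical names introduced here for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/socket.h>
#include <sys/stat.h>
#include <unistd.h>

/* Returns 1 if fd refers to a socket, 0 if it refers to some other
 * kind of file, and -1 if fstat() fails (for example, a bad fd). */
static int fd_is_socket(int fd)
{
    struct stat st;

    if (fstat(fd, &st) != 0)
        return -1;
    return S_ISSOCK(st.st_mode) ? 1 : 0;
}

/* Example use: a fresh AF_UNIX socket is reported as a socket,
 * while the read end of a pipe is not. */
static void fd_is_socket_demo(void)
{
    int p[2];
    int fd = socket(AF_UNIX, SOCK_STREAM, 0);

    if (fd >= 0) {
        assert(fd_is_socket(fd) == 1);
        close(fd);
    }
    if (pipe(p) == 0) {
        assert(fd_is_socket(p[0]) == 0);
        close(p[0]);
        close(p[1]);
    }
}
```

The same `fstat()` call works on both descriptors because both are files as far as the system call layer is concerned; only the file type differs.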

`struct sock` is the in-kernel implementation of the socket. Its header describes it as the AF_INET socket handler, and other families derive from it (for example, `struct unix_sock` for AF_UNIX sockets). It can be used both from inside the kernel and from user space (through `struct socket`).
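"Derived from" in C usually means embedding the base structure as the first member and converting between the two views with pointer arithmetic. A minimal user-space sketch of that pattern follows, under the assumption that this is the kind of derivation meant here; `struct my_unix_sock`, its `peer_id` field, and `my_unix_sk` are hypothetical, and `struct sock` is reduced to a single stand-in field.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the generic part; the real struct sock is far larger. */
struct sock {
    int sk_family;
};

/* Hypothetical family-specific socket. The generic part is embedded
 * as the first member, so a pointer to the derived object is also a
 * valid pointer to its struct sock. */
struct my_unix_sock {
    struct sock sk;     /* keep this as the first member */
    int peer_id;        /* illustrative family-specific field */
};

/* Recover the derived object from a pointer to its embedded base,
 * in the style of the kernel's container_of() idiom. */
static struct my_unix_sock *my_unix_sk(struct sock *sk)
{
    return (struct my_unix_sock *)
        ((char *)sk - offsetof(struct my_unix_sock, sk));
}
```

Generic code can then pass `struct sock *` around without knowing the family, while family-specific code calls `my_unix_sk()` to reach its own fields.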

Both have been around since the Linux 1.0 era (the file headers carry 1993 version stamps), so I doubt you will find a document describing the initial design decision.

+0

OK, but why didn't the kernel developers just put all the fields of `struct sock` and `struct unix_sock` directly into `struct socket`? – user1202136

+1

@user1202136: Because that would go against [decomposition](https://en.wikipedia.org/wiki/Decomposition_(computer_science)), which is what keeps everything simple. Also note that `AF_INET` and `AF_UNIX` are only 2 of the 43 available address families (see `AF_MAX`). In addition, `struct socket` is created by `socket()` before the lower-level implementation structure is created. – myaut
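The creation order described in this comment can be sketched as a two-step allocation: the generic object first, then a per-family hook that attaches the lower-level object. This is a toy user-space model, not kernel code; `FAM_A`, `FAM_B`, `toy_socket_create`, `toy_socket_release`, and the `families` table are all hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct sock   { int family; };                  /* lower-level part */
struct socket { int type; struct sock *sk; };   /* generic part */

typedef int (*create_fn)(struct socket *sock);

enum { FAM_A, FAM_B, FAM_MAX };   /* hypothetical address families */

/* Per-family hook: allocates the lower-level object and attaches it. */
static int fam_a_create(struct socket *sock)
{
    struct sock *sk = calloc(1, sizeof(*sk));

    if (!sk)
        return -1;
    sk->family = FAM_A;
    sock->sk = sk;
    return 0;
}

/* Only FAM_A is registered; FAM_B has no implementation in this toy. */
static create_fn families[FAM_MAX] = { [FAM_A] = fam_a_create };

static struct socket *toy_socket_create(int family, int type)
{
    struct socket *sock;

    if (family < 0 || family >= FAM_MAX || !families[family])
        return NULL;

    /* Step 1: the generic object exists first; sock->sk is still NULL. */
    sock = calloc(1, sizeof(*sock));
    if (!sock)
        return NULL;
    sock->type = type;

    /* Step 2: the family fills in its own implementation object. */
    if (families[family](sock) != 0) {
        free(sock);
        return NULL;
    }
    return sock;
}

static void toy_socket_release(struct socket *sock)
{
    if (sock) {
        free(sock->sk);
        free(sock);
    }
}
```

With this split, adding another family means registering one more hook in the table; the generic structure does not need to grow fields for every family.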

0

"The two structures are essentially connected": I am not sure what you mean by that.

I think you can find the answer if you look at the source files for these structures:

socket -> linux-src/include/linux/net.h 
sock -> linux-src/include/net/sock.h 

socket

* NET  An implementation of the SOCKET network access protocol. 
*  This is the master header file for the Linux NET layer, 
*  or, in plain English: the networking handling part of the 
*  kernel. 

sock

* INET  An implementation of the TCP/IP protocol suite for the LINUX 
*  operating system. INET is implemented using the BSD Socket 
*  interface as the means of communication with the user level. 

These structures are different, and each is a different representation of the socket abstraction.

Here is an answer about the different kinds of sockets:

Unix vs BSD vs TCP vs Internet sockets?

Where to define additional fields depends on your intent. Describe your task.

Please take a look at the sources.

linux-src/include/linux/net.h

/* 
* NET  An implementation of the SOCKET network access protocol. 
*  This is the master header file for the Linux NET layer, 
*  or, in plain English: the networking handling part of the 
*  kernel. 
* 
* Version: @(#)net.h 1.0.3 05/25/93 
* 
* Authors: Orest Zborowski, <[email protected]> 
*  Ross Biro 
*  Fred N. van Kempen, <[email protected]> 
* 
*  This program is free software; you can redistribute it and/or 
*  modify it under the terms of the GNU General Public License 
*  as published by the Free Software Foundation; either version 
*  2 of the License, or (at your option) any later version. 
*/ 
..... 
..... 
..... 
/** 
* struct socket - general BSD socket 
* @state: socket state (%SS_CONNECTED, etc) 
* @type: socket type (%SOCK_STREAM, etc) 
* @flags: socket flags (%SOCK_NOSPACE, etc) 
* @ops: protocol specific socket operations 
* @file: File back pointer for gc 
* @sk: internal networking protocol agnostic socket representation 
* @wq: wait queue for several uses 
*/ 
struct socket { 
    socket_state  state; 

    kmemcheck_bitfield_begin(type); 
    short   type; 
    kmemcheck_bitfield_end(type); 

    unsigned long  flags; 

    struct socket_wq __rcu *wq; 

    struct file  *file; 
    struct sock  *sk; 
    const struct proto_ops *ops; 
}; 

linux-src/include/net/sock.h

/* 
* INET  An implementation of the TCP/IP protocol suite for the LINUX 
*  operating system. INET is implemented using the BSD Socket 
*  interface as the means of communication with the user level. 
* 
*  Definitions for the AF_INET socket handler. 
* 
* Version: @(#)sock.h 1.0.4 05/13/93 
* 
* Authors: Ross Biro 
*  Fred N. van Kempen, <[email protected]> 
*  Corey Minyard <[email protected]> 
*  Florian La Roche <[email protected]> 
* 
* Fixes: 
*  Alan Cox : Volatiles in skbuff pointers. See 
*     skbuff comments. May be overdone, 
*     better to prove they can be removed 
*     than the reverse. 
*  Alan Cox : Added a zapped field for tcp to note 
*     a socket is reset and must stay shut up 
*  Alan Cox : New fields for options 
* Pauline Middelink : identd support 
*  Alan Cox : Eliminate low level recv/recvfrom 
*  David S. Miller : New socket lookup architecture. 
*    Steve Whitehouse:  Default routines for sock_ops 
*    Arnaldo C. Melo : removed net_pinfo, tp_pinfo and made 
*       protinfo be just a void pointer, as the 
*       protocol specific parts were moved to 
*       respective headers and ipv4/v6, etc now 
*       use private slabcaches for its socks 
*    Pedro Hortas : New flags field for socket options 
* 
* 
*  This program is free software; you can redistribute it and/or 
*  modify it under the terms of the GNU General Public License 
*  as published by the Free Software Foundation; either version 
*  2 of the License, or (at your option) any later version. 
*/ 
.... 
.... 
.... 
/** 
    * struct sock - network layer representation of sockets 
    * @__sk_common: shared layout with inet_timewait_sock 
    * @sk_shutdown: mask of %SEND_SHUTDOWN and/or %RCV_SHUTDOWN 
    * @sk_userlocks: %SO_SNDBUF and %SO_RCVBUF settings 
    * @sk_lock: synchronizer 
    * @sk_kern_sock: True if sock is using kernel lock classes 
    * @sk_rcvbuf: size of receive buffer in bytes 
    * @sk_wq: sock wait queue and async head 
    * @sk_rx_dst: receive input route used by early demux 
    * @sk_dst_cache: destination cache 
    * @sk_dst_pending_confirm: need to confirm neighbour 
    * @sk_policy: flow policy 
    * @sk_receive_queue: incoming packets 
    * @sk_wmem_alloc: transmit queue bytes committed 
    * @sk_tsq_flags: TCP Small Queues flags 
    * @sk_write_queue: Packet sending queue 
    * @sk_omem_alloc: "o" is "option" or "other" 
    * @sk_wmem_queued: persistent queue size 
    * @sk_forward_alloc: space allocated forward 
    * @sk_napi_id: id of the last napi context to receive data for sk 
    * @sk_ll_usec: usecs to busypoll when there is no data 
    * @sk_allocation: allocation mode 
    * @sk_pacing_rate: Pacing rate (if supported by transport/packet scheduler) 
    * @sk_pacing_status: Pacing status (requested, handled by sch_fq) 
    * @sk_max_pacing_rate: Maximum pacing rate (%SO_MAX_PACING_RATE) 
    * @sk_sndbuf: size of send buffer in bytes 
    * @__sk_flags_offset: empty field used to determine location of bitfield 
    * @sk_padding: unused element for alignment 
    * @sk_no_check_tx: %SO_NO_CHECK setting, set checksum in TX packets 
    * @sk_no_check_rx: allow zero checksum in RX packets 
    * @sk_route_caps: route capabilities (e.g. %NETIF_F_TSO) 
    * @sk_route_nocaps: forbidden route capabilities (e.g NETIF_F_GSO_MASK) 
    * @sk_gso_type: GSO type (e.g. %SKB_GSO_TCPV4) 
    * @sk_gso_max_size: Maximum GSO segment size to build 
    * @sk_gso_max_segs: Maximum number of GSO segments 
    * @sk_lingertime: %SO_LINGER l_linger setting 
    * @sk_backlog: always used with the per-socket spinlock held 
    * @sk_callback_lock: used with the callbacks in the end of this struct 
    * @sk_error_queue: rarely used 
    * @sk_prot_creator: sk_prot of original sock creator (see ipv6_setsockopt, 
    *   IPV6_ADDRFORM for instance) 
    * @sk_err: last error 
    * @sk_err_soft: errors that don't cause failure but are the cause of a 
    *   persistent failure not just 'timed out' 
    * @sk_drops: raw/udp drops counter 
    * @sk_ack_backlog: current listen backlog 
    * @sk_max_ack_backlog: listen backlog set in listen() 
    * @sk_uid: user id of owner 
    * @sk_priority: %SO_PRIORITY setting 
    * @sk_type: socket type (%SOCK_STREAM, etc) 
    * @sk_protocol: which protocol this socket belongs in this network family 
    * @sk_peer_pid: &struct pid for this socket's peer 
    * @sk_peer_cred: %SO_PEERCRED setting 
    * @sk_rcvlowat: %SO_RCVLOWAT setting 
    * @sk_rcvtimeo: %SO_RCVTIMEO setting 
    * @sk_sndtimeo: %SO_SNDTIMEO setting 
    * @sk_txhash: computed flow hash for use on transmit 
    * @sk_filter: socket filtering instructions 
    * @sk_timer: sock cleanup timer 
    * @sk_stamp: time stamp of last packet received 
    * @sk_tsflags: SO_TIMESTAMPING socket options 
    * @sk_tskey: counter to disambiguate concurrent tstamp requests 
    * @sk_zckey: counter to order MSG_ZEROCOPY notifications 
    * @sk_socket: Identd and reporting IO signals 
    * @sk_user_data: RPC layer private data 
    * @sk_frag: cached page frag 
    * @sk_peek_off: current peek_offset value 
    * @sk_send_head: front of stuff to transmit 
    * @sk_security: used by security modules 
    * @sk_mark: generic packet mark 
    * @sk_cgrp_data: cgroup data for this cgroup 
    * @sk_memcg: this socket's memory cgroup association 
    * @sk_write_pending: a write to stream socket waits to start 
    * @sk_state_change: callback to indicate change in the state of the sock 
    * @sk_data_ready: callback to indicate there is data to be processed 
    * @sk_write_space: callback to indicate there is bf sending space available 
    * @sk_error_report: callback to indicate errors (e.g. %MSG_ERRQUEUE) 
    * @sk_backlog_rcv: callback to process the backlog 
    * @sk_destruct: called at sock freeing time, i.e. when all refcnt == 0 
    * @sk_reuseport_cb: reuseport group container 
    * @sk_rcu: used during RCU grace period 
    */ 
struct sock { 
    /* 
    * Now struct inet_timewait_sock also uses sock_common, so please just 
    * don't add nothing before this first member (__sk_common) --acme 
    */ 
    struct sock_common __sk_common; 
#define sk_node   __sk_common.skc_node 
#define sk_nulls_node  __sk_common.skc_nulls_node 
#define sk_refcnt  __sk_common.skc_refcnt 
#define sk_tx_queue_mapping __sk_common.skc_tx_queue_mapping 

#define sk_dontcopy_begin __sk_common.skc_dontcopy_begin 
#define sk_dontcopy_end  __sk_common.skc_dontcopy_end 
#define sk_hash   __sk_common.skc_hash 
#define sk_portpair  __sk_common.skc_portpair 
#define sk_num   __sk_common.skc_num 
#define sk_dport  __sk_common.skc_dport 
#define sk_addrpair  __sk_common.skc_addrpair 
#define sk_daddr  __sk_common.skc_daddr 
#define sk_rcv_saddr  __sk_common.skc_rcv_saddr 
#define sk_family  __sk_common.skc_family 
#define sk_state  __sk_common.skc_state 
#define sk_reuse  __sk_common.skc_reuse 
#define sk_reuseport  __sk_common.skc_reuseport 
#define sk_ipv6only  __sk_common.skc_ipv6only 
#define sk_net_refcnt  __sk_common.skc_net_refcnt 
#define sk_bound_dev_if  __sk_common.skc_bound_dev_if 
#define sk_bind_node  __sk_common.skc_bind_node 
#define sk_prot   __sk_common.skc_prot 
#define sk_net   __sk_common.skc_net 
#define sk_v6_daddr  __sk_common.skc_v6_daddr 
#define sk_v6_rcv_saddr __sk_common.skc_v6_rcv_saddr 
#define sk_cookie  __sk_common.skc_cookie 
#define sk_incoming_cpu  __sk_common.skc_incoming_cpu 
#define sk_flags  __sk_common.skc_flags 
#define sk_rxhash  __sk_common.skc_rxhash 

    socket_lock_t  sk_lock; 
    atomic_t  sk_drops; 
    int   sk_rcvlowat; 
    struct sk_buff_head sk_error_queue; 
    struct sk_buff_head sk_receive_queue; 
    /* 
    * The backlog queue is special, it is always used with 
    * the per-socket spinlock held and requires low latency 
    * access. Therefore we special case it's implementation. 
    * Note : rmem_alloc is in this structure to fill a hole 
    * on 64bit arches, not because its logically part of 
    * backlog. 
    */ 
    struct { 
     atomic_t rmem_alloc; 
     int  len; 
     struct sk_buff *head; 
     struct sk_buff *tail; 
    } sk_backlog; 
#define sk_rmem_alloc sk_backlog.rmem_alloc 

    int   sk_forward_alloc; 
#ifdef CONFIG_NET_RX_BUSY_POLL 
    unsigned int  sk_ll_usec; 
    /* ===== mostly read cache line ===== */ 
    unsigned int  sk_napi_id; 
#endif 
    int   sk_rcvbuf; 

    struct sk_filter __rcu *sk_filter; 
    union { 
     struct socket_wq __rcu *sk_wq; 
     struct socket_wq *sk_wq_raw; 
    }; 
#ifdef CONFIG_XFRM 
    struct xfrm_policy __rcu *sk_policy[2]; 
#endif 
    struct dst_entry *sk_rx_dst; 
    struct dst_entry __rcu *sk_dst_cache; 
    atomic_t  sk_omem_alloc; 
    int   sk_sndbuf; 

    /* ===== cache line for TX ===== */ 
    int   sk_wmem_queued; 
    refcount_t  sk_wmem_alloc; 
    unsigned long  sk_tsq_flags; 
    struct sk_buff  *sk_send_head; 
    struct sk_buff_head sk_write_queue; 
    __s32   sk_peek_off; 
    int   sk_write_pending; 
    __u32   sk_dst_pending_confirm; 
    u32   sk_pacing_status; /* see enum sk_pacing */ 
    long   sk_sndtimeo; 
    struct timer_list sk_timer; 
    __u32   sk_priority; 
    __u32   sk_mark; 
    u32   sk_pacing_rate; /* bytes per second */ 
    u32   sk_max_pacing_rate; 
    struct page_frag sk_frag; 
    netdev_features_t sk_route_caps; 
    netdev_features_t sk_route_nocaps; 
    int   sk_gso_type; 
    unsigned int  sk_gso_max_size; 
    gfp_t   sk_allocation; 
    __u32   sk_txhash; 

    /* 
    * Because of non atomicity rules, all 
    * changes are protected by socket lock. 
    */ 
    unsigned int  __sk_flags_offset[0]; 
#ifdef __BIG_ENDIAN_BITFIELD 
#define SK_FL_PROTO_SHIFT 16 
#define SK_FL_PROTO_MASK 0x00ff0000 

#define SK_FL_TYPE_SHIFT 0 
#define SK_FL_TYPE_MASK 0x0000ffff 
#else 
#define SK_FL_PROTO_SHIFT 8 
#define SK_FL_PROTO_MASK 0x0000ff00 

#define SK_FL_TYPE_SHIFT 16 
#define SK_FL_TYPE_MASK 0xffff0000 
#endif 

    kmemcheck_bitfield_begin(flags); 
    unsigned int  sk_padding : 1, 
       sk_kern_sock : 1, 
       sk_no_check_tx : 1, 
       sk_no_check_rx : 1, 
       sk_userlocks : 4, 
       sk_protocol : 8, 
       sk_type  : 16; 
#define SK_PROTOCOL_MAX U8_MAX 
    kmemcheck_bitfield_end(flags); 

    u16   sk_gso_max_segs; 
    unsigned long   sk_lingertime; 
    struct proto  *sk_prot_creator; 
    rwlock_t  sk_callback_lock; 
    int   sk_err, 
       sk_err_soft; 
    u32   sk_ack_backlog; 
    u32   sk_max_ack_backlog; 
    kuid_t   sk_uid; 
    struct pid  *sk_peer_pid; 
    const struct cred *sk_peer_cred; 
    long   sk_rcvtimeo; 
    ktime_t   sk_stamp; 
    u16   sk_tsflags; 
    u8   sk_shutdown; 
    u32   sk_tskey; 
    atomic_t  sk_zckey; 
    struct socket  *sk_socket; 
    void   *sk_user_data; 
#ifdef CONFIG_SECURITY 
    void   *sk_security; 
#endif 
    struct sock_cgroup_data sk_cgrp_data; 
    struct mem_cgroup *sk_memcg; 
    void   (*sk_state_change)(struct sock *sk); 
    void   (*sk_data_ready)(struct sock *sk); 
    void   (*sk_write_space)(struct sock *sk); 
    void   (*sk_error_report)(struct sock *sk); 
    int   (*sk_backlog_rcv)(struct sock *sk, 
          struct sk_buff *skb); 
    void     (*sk_destruct)(struct sock *sk); 
    struct sock_reuseport __rcu *sk_reuseport_cb; 
    struct rcu_head  sk_rcu; 
}; 
+0

Copy-pasting source code without any explanation really isn't helpful. I don't have a specific task; I just want to understand the logic. For example, why is `sk_uid` in `struct sock` and not in `struct socket`? – user1202136